bsoup

package
v0.5.0
Published: Apr 30, 2021 License: MIT Imports: 9 Imported by: 2

README

bsoup

bsoup defines a beautiful-soup-like API for working with HTML documents

Functions

parseHtml(html string) SoupNode

parseHtml parses html from a string, returning the root SoupNode

Types

SoupNode

Methods

find(name, attrs, recursive, string, **kwargs)

Retrieves the first occurrence of an element that matches the arguments passed to find. Works similarly to node.find().

find_all(name, attrs, recursive, string, limit, **kwargs)

Retrieves all descendants that match the arguments passed to find_all. Works similarly to node.find_all().

attrs()

Gets a dictionary of element attributes. Works similarly to node.attrs.

contents()

Gets the list of children of an element. Works similarly to soup.contents.

child()

Gets a single child element with the given tag name. Works like accessing a node using its tag name.

parent()

Gets the parent node of an element. Works like node.parent.

next_sibling()

Gets the next sibling of an element. Works like node.next_sibling.

prev_sibling()

Gets the previous sibling of an element. Works like node.prev_sibling.

get_text()

Gets all the text in a document or beneath a tag as a single Unicode string. Works like soup.get_text.

Documentation

Overview

Package bsoup defines a beautiful-soup-like API for working with HTML documents in Starlark

outline: bsoup
  bsoup defines a beautiful-soup-like API for working with HTML documents
  path: bsoup
  types:
    SoupNode
      methods:
        find(name, attrs, recursive, string, **kwargs)
          retrieves the first occurrence of an element that matches arguments passed to find.
          works similarly to [node.find()](https://www.crummy.com/software/BeautifulSoup/bs4/doc/#find)
        find_all(name, attrs, recursive, string, limit, **kwargs)
          retrieves all descendants that match arguments passed to find_all.
          works similarly to [node.find_all()](https://www.crummy.com/software/BeautifulSoup/bs4/doc/#find-all)
        attrs()
          get a dictionary of element attributes
          works similarly to [node.attrs](https://www.crummy.com/software/BeautifulSoup/bs4/doc/#attributes)
        contents()
          gets the list of children of an element
          works similarly to [soup.contents](https://www.crummy.com/software/BeautifulSoup/bs4/doc/#contents-and-children)
        child()
          gets a single child element with the given tag name
          works like accessing a node [using its tag name](https://www.crummy.com/software/BeautifulSoup/bs4/doc/#navigating-using-tag-names)
        parent()
          gets the parent node of an element
          works like [node.parent](https://www.crummy.com/software/BeautifulSoup/bs4/doc/#parent)
        next_sibling()
          gets the next sibling of an element
          works like [node.next_sibling](https://www.crummy.com/software/BeautifulSoup/bs4/doc/#next-sibling-and-previous-sibling)
        prev_sibling()
          gets the previous sibling of an element
          works like [node.prev_sibling](https://www.crummy.com/software/BeautifulSoup/bs4/doc/#next-sibling-and-previous-sibling)
        get_text()
          all the text in a document or beneath a tag, as a single Unicode string:
          works like [soup.get_text](https://www.crummy.com/software/BeautifulSoup/bs4/doc/#get-text)
  functions:
    parseHtml(html string) SoupNode
      parseHtml parses html from a string, returning the root SoupNode

Index

Constants

const ModuleName = "bsoup.star"

ModuleName defines the name for loading this module, using `load('bsoup.star', 'bsoup')`

Variables

This section is empty.

Functions

func AsString

func AsString(x starlark.Value) (string, error)

AsString converts a starlark Value into a string, with outer quotes trimmed
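To illustrate what "outer quotes trimmed" means, here is a minimal, stdlib-only sketch. It assumes the value's String() method returns a quoted literal, as Starlark strings do, and that unquoting is enough to recover the raw text. The names stringer, quoted, and asStringSketch are hypothetical stand-ins, not part of bsoup or go.starlark.net, and this may differ from the package's actual implementation.

```go
package main

import (
	"fmt"
	"strconv"
)

// stringer is a hypothetical stand-in for starlark.Value;
// only the String() method matters for this sketch.
type stringer interface{ String() string }

// quoted mimics a string value whose String() yields a quoted literal.
type quoted string

func (q quoted) String() string { return strconv.Quote(string(q)) }

// asStringSketch strips the outer quotes from the value's string form.
// It returns an error when the string form is not a valid quoted literal.
func asStringSketch(x stringer) (string, error) {
	return strconv.Unquote(x.String())
}

func main() {
	raw := quoted("div")
	fmt.Println(raw.String()) // quoted form, as String() reports it

	s, err := asStringSketch(raw)
	fmt.Println(s, err) // unquoted form and a nil error
}
```

Returning an error rather than silently passing the input through keeps non-string values from being mistaken for strings.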

func LoadModule

func LoadModule() (starlark.StringDict, error)

LoadModule loads the bsoup module. Concurrency-safe and idempotent.
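One common way to get a loader that is both concurrency-safe and idempotent in Go is sync.Once. The sketch below shows that pattern with a plain map standing in for starlark.StringDict. Whether bsoup uses exactly this mechanism is an assumption, and loadModuleSketch, module, and builds are hypothetical names introduced only for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

var (
	once   sync.Once
	module map[string]string // stand-in for starlark.StringDict
	builds int               // counts how often the module is constructed
)

// loadModuleSketch builds the module on the first call only.
// Later calls, including concurrent ones, return the same map.
func loadModuleSketch() (map[string]string, error) {
	once.Do(func() {
		builds++
		module = map[string]string{"bsoup": "module placeholder"}
	})
	return module, nil
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			loadModuleSketch() // safe to call from many goroutines
		}()
	}
	wg.Wait()
	fmt.Println("builds:", builds) // the constructor body ran once
}
```

Because sync.Once guarantees the initializer has finished before any call to Do returns, every caller observes a fully built module.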

func NewSoupNode

func NewSoupNode(root *soup.Root) starlark.Value

NewSoupNode constructs a new SoupNode by cloning each field from the soup.Root

func ParseHTML

func ParseHTML(thread *starlark.Thread, _ *starlark.Builtin, args starlark.Tuple, kwargs []starlark.Tuple) (starlark.Value, error)

ParseHTML parses html from a string, and returns it as a SoupNode

Types

type SoupNode

type SoupNode soup.Root

SoupNode extends soup's Root struct with starlark support

func (*SoupNode) Attr

func (n *SoupNode) Attr(name string) (starlark.Value, error)

Attr returns an attribute of a SoupNode

func (*SoupNode) AttrNames

func (n *SoupNode) AttrNames() []string

AttrNames returns all attributes of a SoupNode

func (*SoupNode) Freeze

func (n *SoupNode) Freeze()

Freeze freezes a SoupNode struct, which is already immutable

func (*SoupNode) Hash

func (n *SoupNode) Hash() (uint32, error)

Hash calculates a hash of a SoupNode

func (*SoupNode) String

func (n *SoupNode) String() string

String converts a SoupNode to a string by rendering each node

func (*SoupNode) Truth

func (n *SoupNode) Truth() starlark.Bool

Truth returns whether a SoupNode is non-nil

func (*SoupNode) Type

func (n *SoupNode) Type() string

Type returns the type of SoupNode as string
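The methods listed above (String, Type, Freeze, Truth, Hash, Attr, AttrNames) match the method names that go.starlark.net expects of a value with attributes. The sketch below mirrors that shape with local, simplified interfaces so it compiles with the standard library alone. The names value, hasAttrs, and toyNode are hypothetical, Truth returns a plain bool here rather than starlark.Bool, and the method bodies are illustrative assumptions rather than bsoup's code.

```go
package main

import (
	"errors"
	"fmt"
	"hash/fnv"
)

// value is a simplified local mirror of the core value method set.
type value interface {
	String() string
	Type() string
	Freeze()
	Truth() bool
	Hash() (uint32, error)
}

// hasAttrs adds attribute lookup on top of value.
type hasAttrs interface {
	value
	Attr(name string) (value, error)
	AttrNames() []string
}

// toyNode is a hypothetical immutable node holding only a tag name.
type toyNode struct{ tag string }

func (n *toyNode) String() string { return "<" + n.tag + ">" }
func (n *toyNode) Type() string   { return "ToyNode" }
func (n *toyNode) Freeze()        {} // nothing to do: the node never changes
func (n *toyNode) Truth() bool    { return n != nil }

// Hash derives a hash from the rendered form of the node.
func (n *toyNode) Hash() (uint32, error) {
	h := fnv.New32a()
	h.Write([]byte(n.String()))
	return h.Sum32(), nil
}

// Attr reports an error for every name because the toy node has no attributes.
func (n *toyNode) Attr(name string) (value, error) {
	return nil, errors.New("toyNode has no attribute " + name)
}

func (n *toyNode) AttrNames() []string { return nil }

// Compile-time check that *toyNode satisfies the interface.
var _ hasAttrs = (*toyNode)(nil)

func main() {
	var missing *toyNode
	node := &toyNode{tag: "p"}
	fmt.Println(node.String(), node.Type(), node.Truth(), missing.Truth())
}
```

The compile-time assertion is a cheap way to catch a missing or mistyped method before the type is ever handed to an interpreter.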
