bsoup

package

v0.5.0 (latest), Published: Apr 30, 2021, License: MIT, Imports: 9, Imported by: 4

Repository

github.com/qri-io/starlib

README

bsoup

bsoup defines a beautiful-soup-like API for working with HTML documents

Functions

`parseHtml(html string) SoupNode`

parseHtml parses HTML from a string and returns the root SoupNode.

Types

`SoupNode`

Methods

`find(name, attrs, recursive, string, **kwargs)`

Retrieves the first occurrence of an element that matches the arguments passed to find. Works similarly to Beautiful Soup's node.find().

`find_all(name, attrs, recursive, string, limit, **kwargs)`

Retrieves all descendants that match the arguments passed to find_all. Works similarly to Beautiful Soup's node.find_all().

`attrs()`

Gets a dictionary of the element's attributes. Works similarly to Beautiful Soup's node.attrs.

`contents()`

Gets the list of children of an element. Works similarly to Beautiful Soup's soup.contents.

`child()`

Gets a single child element with the given tag name. Works like accessing a node using its tag name in Beautiful Soup.

`parent()`

Gets the parent node of an element. Works like Beautiful Soup's node.parent.

`next_sibling()`

Gets the next sibling of an element. Works like Beautiful Soup's node.next_sibling.

`prev_sibling()`

Gets the previous sibling of an element. Works like Beautiful Soup's node.prev_sibling.

`get_text()`

Returns all the text in a document or beneath a tag as a single Unicode string. Works like Beautiful Soup's soup.get_text.

Documentation

Overview

Package bsoup defines a beautiful-soup-like API for working with HTML documents in Starlark.

outline: bsoup
  bsoup defines a beautiful-soup-like API for working with HTML documents
  path: bsoup
  types:
    SoupNode
      methods:
        find(name, attrs, recursive, string, **kwargs)
          retrieves the first occurrence of an element that matches arguments passed to find.
          works similarly to [node.find()](https://www.crummy.com/software/BeautifulSoup/bs4/doc/#find)
        find_all(name, attrs, recursive, string, limit, **kwargs)
          retrieves all descendants that match arguments passed to find_all.
          works similarly to [node.find_all()](https://www.crummy.com/software/BeautifulSoup/bs4/doc/#find-all)
        attrs()
          get a dictionary of element attributes
          works similarly to [node.attrs](https://www.crummy.com/software/BeautifulSoup/bs4/doc/#attributes)
        contents()
          gets the list of children of an element
          works similarly to [soup.contents](https://www.crummy.com/software/BeautifulSoup/bs4/doc/#contents-and-children)
        child()
          gets a single child element with the given tag name
          works like accessing a node [using its tag name](https://www.crummy.com/software/BeautifulSoup/bs4/doc/#navigating-using-tag-names)
        parent()
          gets the parent node of an element
          works like [node.parent](https://www.crummy.com/software/BeautifulSoup/bs4/doc/#parent)
        next_sibling()
          gets the next sibling of an element
          works like [node.next_sibling](https://www.crummy.com/software/BeautifulSoup/bs4/doc/#next-sibling-and-previous-sibling)
        prev_sibling()
          gets the previous sibling of an element
          works like [node.prev_sibling](https://www.crummy.com/software/BeautifulSoup/bs4/doc/#next-sibling-and-previous-sibling)
        get_text()
          all the text in a document or beneath a tag, as a single Unicode string
          works like [soup.get_text](https://www.crummy.com/software/BeautifulSoup/bs4/doc/#get-text)
  functions:
    parseHtml(html string) SoupNode
      parseHtml parses html from a string, returning the root SoupNode

Index

Constants
func AsString(x starlark.Value) (string, error)
func LoadModule() (starlark.StringDict, error)
func NewSoupNode(root *soup.Root) starlark.Value
func ParseHTML(thread *starlark.Thread, _ *starlark.Builtin, args starlark.Tuple, ...) (starlark.Value, error)
type SoupNode

Constants

const ModuleName = "bsoup.star"

ModuleName defines the name for loading this module, using `load('bsoup.star', 'bsoup')`

Functions

func AsString

func AsString(x starlark.Value) (string, error)

AsString converts a starlark Value into a string, with outer quotes trimmed
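
The quote trimming described above can be illustrated with the standard library alone. This is a minimal sketch, not the package's implementation: `stringer`, `quoted`, and `asString` are hypothetical stand-ins for `starlark.Value`, a Starlark string, and `AsString`. It assumes the value's `String()` form is a double-quoted literal that `strconv.Unquote` can parse.

```go
package main

import (
	"fmt"
	"strconv"
)

// stringer is a hypothetical stand-in for starlark.Value; only String() matters here.
type stringer interface {
	String() string
}

// quoted mimics a string value whose String() form includes the outer quotes.
type quoted string

func (q quoted) String() string { return strconv.Quote(string(q)) }

// asString is an illustrative helper (not the package's AsString). It removes
// the outer quotes and resolves escape sequences via strconv.Unquote, and it
// returns an error when the input is not a quoted literal.
func asString(x stringer) (string, error) {
	return strconv.Unquote(x.String())
}

func main() {
	s, err := asString(quoted("<p>hi</p>"))
	if err != nil {
		panic(err)
	}
	fmt.Println(s) // prints <p>hi</p>
}
```

Under this assumption, a value whose `String()` form is not a quoted literal (a number, for example) would yield an error rather than a string.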

func LoadModule

func LoadModule() (starlark.StringDict, error)

LoadModule loads the bsoup module. Concurrency-safe and idempotent.
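
One common way to obtain a loader that is both concurrency-safe and idempotent in Go is `sync.Once`. The sketch below shows that pattern under stated assumptions: `loadModule`, `module`, and `initCount` are hypothetical names, and a plain `map[string]string` stands in for `starlark.StringDict`. Whether LoadModule itself is built this way is not stated in the documentation above.

```go
package main

import (
	"fmt"
	"sync"
)

var (
	once      sync.Once
	module    map[string]string // stand-in for starlark.StringDict
	initCount int               // counts how often initialization actually ran
)

// loadModule is a hypothetical loader. The initialization body runs at most
// once, no matter how many goroutines call loadModule concurrently, and every
// caller receives the same map.
func loadModule() (map[string]string, error) {
	once.Do(func() {
		initCount++
		module = map[string]string{"bsoup": "module"}
	})
	return module, nil
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if _, err := loadModule(); err != nil {
				panic(err)
			}
		}()
	}
	wg.Wait()
	fmt.Println("initialized", initCount, "time(s)") // prints: initialized 1 time(s)
}
```

With this design, repeated calls are cheap and callers do not need their own locking around the load.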

func NewSoupNode

func NewSoupNode(root *soup.Root) starlark.Value

NewSoupNode constructs a new SoupNode by cloning each field from the soup.Root

func ParseHTML

func ParseHTML(thread *starlark.Thread, _ *starlark.Builtin, args starlark.Tuple, kwargs []starlark.Tuple) (starlark.Value, error)

ParseHTML parses html from a string, and returns it as a SoupNode

Types

type SoupNode

type SoupNode soup.Root

SoupNode extends soup's Root struct with starlark support

func (*SoupNode) Attr

func (n *SoupNode) Attr(name string) (starlark.Value, error)

Attr returns an attribute of a SoupNode

func (*SoupNode) AttrNames

func (n *SoupNode) AttrNames() []string

AttrNames returns the names of all attributes of a SoupNode

func (*SoupNode) Freeze

func (n *SoupNode) Freeze()

Freeze freezes a SoupNode struct, which is already immutable

func (*SoupNode) Hash

func (n *SoupNode) Hash() (uint32, error)

Hash calculates a hash of a SoupNode

func (*SoupNode) String

func (n *SoupNode) String() string

String converts a SoupNode to a string by rendering each node

func (*SoupNode) Truth

func (n *SoupNode) Truth() starlark.Bool

Truth returns whether a SoupNode is non-nil
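
A nil check on a pointer receiver, as described above, works in Go because a method may be called on a nil pointer as long as it does not dereference it. The following sketch illustrates the idea with a hypothetical `node` type and `truth` method; it returns a plain `bool` instead of `starlark.Bool` to stay within the standard library.

```go
package main

import "fmt"

// node is a hypothetical stand-in for SoupNode.
type node struct {
	tag string
}

// truth reports whether the receiver is non-nil. It never dereferences n,
// so calling it on a nil *node is safe.
func (n *node) truth() bool {
	return n != nil
}

func main() {
	var missing *node        // e.g. the result of a lookup that found nothing
	found := &node{tag: "p"} // e.g. the result of a successful lookup
	fmt.Println(missing.truth(), found.truth()) // prints: false true
}
```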

func (*SoupNode) Type

func (n *SoupNode) Type() string

Type returns the type of SoupNode as string
