GABAHTMLParser

v0.0.0-...-e7b449c Latest
Published: May 31, 2021 License: MIT Imports: 8 Imported by: 0

README

GABAHTMLParser

A simple HTML parser for Go.

Installation

go get github.com/HarrisonKawagoe3960X/GABAHTMLParser

Usage

Import GABAHTMLParser

Add "github.com/HarrisonKawagoe3960X/GABAHTMLParser" to your import block, like this:

package main

import (
	"fmt"

	"github.com/HarrisonKawagoe3960X/GABAHTMLParser" // Add this
)

func main() {
	// Fetch and parse the page; false means the source is not Shift-JIS.
	htmlobject := GABAHTMLParser.GetHTMLfromURL("someurl", false)

	// Print the inner HTML of every <a> element.
	results := htmlobject.Find("tag = 'a'")
	for _, result := range results {
		fmt.Println(result.InnerHTML)
	}
}
Parse HTML from URL or path

htmlobject := GABAHTMLParser.GetHTMLfromURL("someurl", false)

If the site's source uses Shift-JIS encoding, change false to true:

htmlobject := GABAHTMLParser.GetHTMLfromURL("someurl", true)
Parse HTML from a string slice

Pass the lines of the document as a []string:

htmlobject := GABAHTMLParser.ParseHTML(strarray)
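
ParseHTML takes a []string according to its documented signature. When the HTML is already held in memory as one string, one way to produce that slice is to scan it line by line. The following is a minimal sketch using only the standard library; linesFromString is a hypothetical helper name, not part of GABAHTMLParser, and it assumes that ParseHTML expects one source line per slice entry.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// linesFromString is a hypothetical helper (not part of GABAHTMLParser).
// It splits an in-memory document into lines, which matches the []string
// parameter in ParseHTML's documented signature.
func linesFromString(doc string) ([]string, error) {
	var lines []string
	scanner := bufio.NewScanner(strings.NewReader(doc))
	for scanner.Scan() {
		lines = append(lines, scanner.Text())
	}
	// The scanner reports failures, such as an over-long line, here.
	if err := scanner.Err(); err != nil {
		return nil, err
	}
	return lines, nil
}

func main() {
	lines, err := linesFromString("<html>\n<body>\n<a href=\"/x\">x</a>\n</body>\n</html>")
	if err != nil {
		fmt.Println("read error:", err)
		return
	}
	fmt.Println(len(lines), "lines ready for parsing")
	// The slice could then be handed to GABAHTMLParser.ParseHTML(lines).
}
```

Note that bufio.Scanner rejects lines longer than its default buffer limit, so minified pages delivered as one very long line may need a larger buffer set with scanner.Buffer.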
Element Object

After parsing the HTML, you can extract data by reading the fields of the returned Element:

  • InnerHTML: HTML code inside the current element.
  • Tag: Tag name of the current element.
  • Child: Child elements of the current element.
  • Parent: Parent element of the current element.
  • Attr: Attributes of the current element, as a map from name to value.
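
The fields listed above are enough to walk a parsed tree by hand. The sketch below redeclares Element locally, copied from the package documentation, so that it compiles without the library; collectAttr and depth are hypothetical helpers, and the tree is built by hand rather than produced by the parser. It assumes the root element has a nil Parent, which the README does not state.

```go
package main

import "fmt"

// Element mirrors the struct shown in the package documentation. It is
// redeclared here only so the sketch compiles on its own.
type Element struct {
	Parent    *Element
	Child     []*Element
	Tag       string
	InnerHTML string
	Attr      map[string]string
}

// collectAttr is a hypothetical helper. It walks the tree depth-first and
// returns the value of attribute name for every element with the given tag.
func collectAttr(root *Element, tag, name string) []string {
	if root == nil {
		return nil
	}
	var out []string
	if root.Tag == tag {
		// Reading from a nil map is safe in Go and simply reports !ok.
		if v, ok := root.Attr[name]; ok {
			out = append(out, v)
		}
	}
	for _, c := range root.Child {
		out = append(out, collectAttr(c, tag, name)...)
	}
	return out
}

// depth counts the Parent links between e and the root (assumed nil Parent).
func depth(e *Element) int {
	d := 0
	for p := e.Parent; p != nil; p = p.Parent {
		d++
	}
	return d
}

func main() {
	// Hand-built tree standing in for a parser result.
	root := &Element{Tag: "body"}
	link := &Element{Parent: root, Tag: "a", Attr: map[string]string{"href": "/home"}}
	root.Child = append(root.Child, link)

	fmt.Println(collectAttr(root, "a", "href"), depth(link))
}
```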
Search for HTML Object

results := htmlobject.Find("tag = 'a'")

You can combine conditions with &&:

results := htmlobject.Find("tag = 'a' && class = 'hanshin'")
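
When the conditions come from variables, the condition string can be assembled programmatically. The sketch below only reproduces the key = 'value' form seen in the examples above; Cond and buildCondition are hypothetical names, not part of GABAHTMLParser. The README does not say how Find treats values containing a single quote, so this sketch does no escaping.

```go
package main

import (
	"fmt"
	"strings"
)

// Cond is a hypothetical key/value pair such as {"tag", "a"}.
type Cond struct {
	Key, Value string
}

// buildCondition joins the pairs into the "key = 'value' && ..." form
// used by the Find examples. Values are inserted as-is, without escaping.
func buildCondition(conds ...Cond) string {
	parts := make([]string, 0, len(conds))
	for _, c := range conds {
		parts = append(parts, fmt.Sprintf("%s = '%s'", c.Key, c.Value))
	}
	return strings.Join(parts, " && ")
}

func main() {
	cond := buildCondition(Cond{"tag", "a"}, Cond{"class", "hanshin"})
	fmt.Println(cond) // prints: tag = 'a' && class = 'hanshin'
	// The string could then be passed to htmlobject.Find(cond).
}
```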

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func IsSpecialTag

func IsSpecialTag(category string) bool

func LinesFromReader

func LinesFromReader(r io.Reader, isEncode bool) ([]string, error)

func Split

func Split(r rune) bool

Types

type Element

type Element struct {
	Parent    *Element
	Child     []*Element
	Tag       string
	InnerHTML string
	Attr      map[string]string
}

func GetHTMLfromURL

func GetHTMLfromURL(path string, isEncode bool) *Element

func ParseHTML

func ParseHTML(lines []string) *Element

func (*Element) Find

func (element *Element) Find(condition string) []*Element
