extract

package
v0.0.0-...-71fd032 Latest
Warning

This package is not in the latest version of its module.

Published: Jul 25, 2021 License: BSD-3-Clause Imports: 5 Imported by: 0

Documentation

Overview

Package extract provides a set of routines for extracting various Twitter "entities" from text.

This package supports extraction of Twitter usernames, replies, lists, hashtags, cashtags, and URLs. The implementation and API are based on the set of similarly named twitter-text-* libraries published by Twitter. This library is tested using the standard Conformance test suite maintained by Twitter (https://github.com/twitter/twitter-text-conformance).

Types

type EntityType

type EntityType int

EntityType specifies the type of a given entity.

const (
	MENTION EntityType = iota
	HASH_TAG
	CASH_TAG
	URL
)

func (EntityType) String

func (t EntityType) String() string

String implements the fmt.Stringer interface.

type Range

type Range struct {
	Start int
	Stop  int
}

Range specifies the offsets in a string where an entity is located.

func (*Range) Length

func (r *Range) Length() int

Length returns the length of the range.

func (Range) String

func (r Range) String() string

String implements the fmt.Stringer interface.
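
A minimal sketch of how such a range might behave, assuming a half-open interval [Start, Stop) so that the length is Stop minus Start. The type name span is hypothetical; the "(8, 14)" string form is taken from the documented example output, and the half-open assumption is consistent with "@user1" (6 characters) being reported at (8, 14).

```go
package main

import "fmt"

// span is a hypothetical stand-in for the package's Range type.
type span struct {
	Start int
	Stop  int
}

// Length assumes a half-open interval [Start, Stop).
func (s span) Length() int { return s.Stop - s.Start }

// String mirrors the "(start, stop)" form seen in the documented output.
func (s span) String() string { return fmt.Sprintf("(%d, %d)", s.Start, s.Stop) }

func main() {
	text := "mention @user1 @user2 and @user3"
	s := span{Start: 8, Stop: 14}
	// For ASCII-only text, rune and byte offsets coincide, so the range
	// can be used directly as a slice expression.
	fmt.Println(s, s.Length(), text[s.Start:s.Stop]) // prints "(8, 14) 6 @user1"
}
```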

type TwitterEntity

type TwitterEntity struct {
	Text      string // The value of the entity
	Range     Range  // Represents the location of the entity in character/rune offsets
	ByteRange Range  // Represents the location of the entity in byte offsets
	Type      EntityType
	// contains filtered or unexported fields
}

TwitterEntity represents an "entity" within a string.
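
Range and ByteRange differ whenever the text before or inside an entity contains multi-byte UTF-8 characters. The sketch below shows the distinction using only the standard library; the helper offsets is hypothetical and is not part of this package.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// offsets returns the rune-based and byte-based [start, stop) positions of
// the first occurrence of sub in text, or ok=false if sub is absent.
func offsets(text, sub string) (runeStart, runeStop, byteStart, byteStop int, ok bool) {
	byteStart = strings.Index(text, sub)
	if byteStart < 0 {
		return 0, 0, 0, 0, false
	}
	byteStop = byteStart + len(sub)
	// Count the runes that precede the match to convert bytes to runes.
	runeStart = utf8.RuneCountInString(text[:byteStart])
	runeStop = runeStart + utf8.RuneCountInString(sub)
	return runeStart, runeStop, byteStart, byteStop, true
}

func main() {
	text := "caf\u00e9 @user" // the e-acute occupies two bytes in UTF-8
	rs, re, bs, be, _ := offsets(text, "@user")
	fmt.Printf("runes (%d, %d) bytes (%d, %d)\n", rs, re, bs, be)
	// prints "runes (5, 10) bytes (6, 11)"
}
```

Byte offsets are the ones that can be used directly in a Go slice expression such as text[start:stop]; rune offsets match what a user would count as character positions.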

func ExtractCashtags

func ExtractCashtags(text string) []*TwitterEntity

Extracts $cashtag occurrences from the supplied text. Returns a slice of TwitterEntity struct pointers.

The Cashtag method of each returned entity yields the value of the extracted cashtag without the leading $ character.
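
The idea of returning the symbol without its leading $ can be sketched with a deliberately simplified pattern. This is not the package's implementation: the regular expression and the cashtags helper below are assumptions for illustration, and the twitter-text grammar that the package follows has additional rules that this sketch ignores.

```go
package main

import (
	"fmt"
	"regexp"
)

// cashtagRe is a simplified pattern: '$' followed by 1 to 6 ASCII letters
// and then a word boundary. It is an illustration only.
var cashtagRe = regexp.MustCompile(`\$([A-Za-z]{1,6})\b`)

// cashtags returns the matched symbols without the leading '$'.
func cashtags(text string) []string {
	var out []string
	for _, m := range cashtagRe.FindAllStringSubmatch(text, -1) {
		out = append(out, m[1]) // m[1] is the capture group, '$' excluded
	}
	return out
}

func main() {
	fmt.Println(cashtags("long $GOOG, short $twtr")) // prints "[GOOG twtr]"
}
```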

func ExtractEntities

func ExtractEntities(text string) []*TwitterEntity

Extracts all usernames, lists, hashtags, and URLs from the given text. Entities are returned in the order in which they appear in the input string.

Example
text := "tweet mentioning @username with a url http://t.co/abcde and a #hashtag"
entities := ExtractEntities(text)

for _, e := range entities {
	fmt.Printf("Entity:%s Type:%v\n", e.Text, e.Type)
}
Output:

Entity:@username Type:MENTION
Entity:http://t.co/abcde Type:URL
Entity:#hashtag Type:HASH_TAG
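
One way to obtain results in order of appearance is to collect the entities from each per-type pass and then sort them by starting offset. The sketch below illustrates that idea only; the entity type and byPosition helper are hypothetical, and the package may order its results differently internally.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// entity is a hypothetical, trimmed-down stand-in for TwitterEntity.
type entity struct {
	Text  string
	Start int // byte offset where the entity begins
}

// byPosition sorts entities in place by their starting offset.
func byPosition(es []entity) {
	sort.Slice(es, func(i, j int) bool { return es[i].Start < es[j].Start })
}

func main() {
	text := "tweet mentioning @username with a url http://t.co/abcde and a #hashtag"
	// Results as they might arrive from separate per-type passes.
	var es []entity
	for _, s := range []string{"#hashtag", "@username", "http://t.co/abcde"} {
		es = append(es, entity{Text: s, Start: strings.Index(text, s)})
	}
	byPosition(es)
	for _, e := range es {
		fmt.Println(e.Text) // @username, then the URL, then #hashtag
	}
}
```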

func ExtractHashtags

func ExtractHashtags(text string) []*TwitterEntity

Extracts #hashtag occurrences from the supplied text. Returns a slice of TwitterEntity struct pointers.

The Hashtag method of each returned entity yields the value of the extracted hashtag without the leading # character.

func ExtractMentionedScreenNames

func ExtractMentionedScreenNames(text string) []*TwitterEntity

Extracts @username mentions from the supplied text. Returns a slice of TwitterEntity struct pointers.

The ScreenName method of each returned entity yields the value of the referenced username without the leading @ sign.

Example
text := "mention @user1 @user2 and @user3"
entities := ExtractMentionedScreenNames(text)
for i, e := range entities {
	sn, _ := e.ScreenName()
	fmt.Printf("Match[%d]:%s Screenname:%s Range:%s\n", i, e.Text, sn, e.Range)
}
Output:

Match[0]:@user1 Screenname:user1 Range:(8, 14)
Match[1]:@user2 Screenname:user2 Range:(15, 21)
Match[2]:@user3 Screenname:user3 Range:(26, 32)

func ExtractMentionsOrLists

func ExtractMentionsOrLists(text string) []*TwitterEntity

Extracts @username mentions or list names from the supplied text. Returns a slice of TwitterEntity struct pointers.

The ScreenName method of each returned entity yields the value of the referenced username without the leading @ sign.

The ListSlug method of each returned entity yields the name of the list (if present), without the leading / or the preceding username.
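
The split between screen name and list slug can be sketched as follows. The pattern and the names parsed and mentionsOrLists are assumptions made for illustration; the twitter-text rules that the package follows are stricter (for example about what may precede the @ sign), so this is not a substitute for ExtractMentionsOrLists.

```go
package main

import (
	"fmt"
	"regexp"
)

// mentionOrListRe is a simplified pattern: "@name" optionally followed by
// "/slug". Group 1 is the screen name, group 2 the slug (may be empty).
var mentionOrListRe = regexp.MustCompile(
	`@([A-Za-z0-9_]{1,20})(?:/([A-Za-z][A-Za-z0-9_-]{0,24}))?`)

// parsed holds the two parts of one match; Slug is empty for a plain mention.
type parsed struct {
	ScreenName string
	Slug       string
}

// mentionsOrLists returns every mention or list reference found in text.
func mentionsOrLists(text string) []parsed {
	var out []parsed
	for _, m := range mentionOrListRe.FindAllStringSubmatch(text, -1) {
		out = append(out, parsed{ScreenName: m[1], Slug: m[2]})
	}
	return out
}

func main() {
	for _, p := range mentionsOrLists("cc @alice and @bob/go-devs") {
		fmt.Printf("%q %q\n", p.ScreenName, p.Slug)
	}
	// prints:
	// "alice" ""
	// "bob" "go-devs"
}
```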

func ExtractReplyScreenname

func ExtractReplyScreenname(text string) *TwitterEntity

Extracts an @username mention from the beginning of the supplied text. A reply is defined as the occurrence of a mention (@username) at the beginning of a tweet, preceded by zero or more spaces.

Returns a pointer to a TwitterEntity struct.
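
The reply definition above (a mention at the start of the text, preceded by zero or more spaces) can be sketched with an anchored pattern. The regular expression and the replyScreenName helper are hypothetical simplifications, not the package's code, and they return a (string, bool) pair rather than a *TwitterEntity.

```go
package main

import (
	"fmt"
	"regexp"
)

// replyRe anchors at the start of the text, allows leading spaces, and
// captures the screen name without the '@'. Simplified for illustration.
var replyRe = regexp.MustCompile(`^ *@([A-Za-z0-9_]{1,20})`)

// replyScreenName returns the leading mention without '@', or ok=false
// when the text does not begin with a mention.
func replyScreenName(text string) (name string, ok bool) {
	m := replyRe.FindStringSubmatch(text)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	name, ok := replyScreenName("  @alice thanks!")
	fmt.Println(name, ok) // prints "alice true"

	_, ok = replyScreenName("thanks @alice")
	fmt.Println(ok) // prints "false": the mention is not at the beginning
}
```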

func ExtractUrls

func ExtractUrls(text string) []*TwitterEntity

Extracts URLs from the given text. Returns a slice of TwitterEntity struct pointers.

func (*TwitterEntity) Cashtag

func (t *TwitterEntity) Cashtag() (string, bool)

Returns the value of the extracted cashtag (when Type == CASH_TAG) and a boolean indicating whether the value is set. The return value is ("", false) when Type != CASH_TAG.

func (*TwitterEntity) Hashtag

func (t *TwitterEntity) Hashtag() (string, bool)

Returns the value of the extracted hashtag (when Type == HASH_TAG) and a boolean indicating whether the value is set. The return value is ("", false) when Type != HASH_TAG.

func (*TwitterEntity) ListSlug

func (t *TwitterEntity) ListSlug() (string, bool)

Returns the value of the extracted list name (when Type == MENTION) and a boolean indicating whether the value is set. The return value is ("", false) when Type != MENTION, or when the extracted entity is a plain mention rather than a list name.

func (*TwitterEntity) ScreenName

func (t *TwitterEntity) ScreenName() (string, bool)

Returns the value of the extracted screen name (when Type == MENTION) and a boolean indicating whether the value is set. The return value is ("", false) when Type != MENTION.
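
The (string, bool) accessors follow Go's "comma ok" idiom: the boolean tells the caller whether the string is meaningful for this entity's type. A minimal sketch of that contract, using a hypothetical entity type and kind constants rather than the package's own definitions:

```go
package main

import "fmt"

// kind is a hypothetical stand-in for EntityType.
type kind int

const (
	kindMention kind = iota
	kindHashtag
)

// entity is a hypothetical stand-in for TwitterEntity with one
// unexported, type-specific field.
type entity struct {
	Text       string
	Type       kind
	screenName string // only meaningful when Type == kindMention
}

// ScreenName follows the documented contract: ("", false) unless the
// entity is a mention.
func (e *entity) ScreenName() (string, bool) {
	if e.Type != kindMention {
		return "", false
	}
	return e.screenName, true
}

func main() {
	es := []*entity{
		{Text: "@user1", Type: kindMention, screenName: "user1"},
		{Text: "#golang", Type: kindHashtag},
	}
	for _, e := range es {
		if sn, ok := e.ScreenName(); ok {
			fmt.Println("mention of", sn)
		} else {
			fmt.Println("not a mention:", e.Text)
		}
	}
}
```

Checking the boolean avoids confusing an entity of another type with a mention whose value happens to be empty.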

func (*TwitterEntity) String

func (t *TwitterEntity) String() string

String implements the fmt.Stringer interface.
