pipeline

package

Version: v0.0.0-...-e4649f2 · Published: May 6, 2022 · License: MIT · Imports: 4 · Imported by: 0

Repository

github.com/coderafting/sentiment-analysis

Documentation ¶

Overview ¶

Package pipeline implements a multi-stage sentiment analysis pipeline. The analysis follows the PANAS-t paper and uses Twitter data. The package also provides a mechanism for running pipeline stages in parallel and for load-balancing work across them.

Index ¶

func ComputeAndSave(in []chan TweetText, db database.DataStore)
func ComputeSentimentAndSave(in chan TweetText, db database.DataStore)
func ConsumeVTPubPT(in []chan TweetText, out []chan TweetText, indexRR *MemRR)
func MemPartitions(num int, buffer int) []chan TweetText
func NextIndex(m *MemRR, max int) int
func PubProcessedText(in chan TweetText, out []chan TweetText, indexRR *MemRR)
func PubValidText(vt TweetText, chansArray []chan TweetText, indexRR *MemRR)
type MemRR
type TweetText

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func ComputeAndSave ¶

func ComputeAndSave(in []chan TweetText, db database.DataStore)

ComputeAndSave starts goroutines, each of which consumes a specific channel and performs the ComputeSentimentAndSave operation on the DB.
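The fan-out described above can be sketched roughly as follows. This is a minimal illustration, not the package's implementation: `counter` is a hypothetical stand-in for `database.DataStore`, `consumeAll` is a hypothetical name, and the `sync.WaitGroup` is added only so the example terminates deterministically. The documentation does not say whether ComputeAndSave itself blocks.

```go
package main

import (
	"fmt"
	"sync"
)

// TweetText mirrors the documented struct.
type TweetText struct {
	TextString        string
	SentimentCategory string
}

// counter is a hypothetical, mutex-guarded stand-in for database.DataStore.
type counter struct {
	mu    sync.Mutex
	saved int
}

// save records that one TweetText was stored.
func (c *counter) save(t TweetText) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.saved++
}

// consumeAll starts one goroutine per channel and waits until every
// channel has been closed and drained.
func consumeAll(in []chan TweetText, db *counter) {
	var wg sync.WaitGroup
	for _, ch := range in {
		wg.Add(1)
		go func(ch <-chan TweetText) {
			defer wg.Done()
			for t := range ch {
				db.save(t)
			}
		}(ch)
	}
	wg.Wait()
}

func main() {
	parts := make([]chan TweetText, 3)
	for i := range parts {
		parts[i] = make(chan TweetText, 2)
		parts[i] <- TweetText{TextString: "first"}
		parts[i] <- TweetText{TextString: "second"}
		close(parts[i])
	}
	db := &counter{}
	consumeAll(parts, db)
	fmt.Println(db.saved) // prints 6
}
```

Because several goroutines write to the same store, the stand-in guards its state with a mutex; a real DataStore would need to be safe for concurrent use in some comparable way.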

func ComputeSentimentAndSave ¶

func ComputeSentimentAndSave(in chan TweetText, db database.DataStore)

ComputeSentimentAndSave consumes processed TweetText values from a channel, saves each text to the DB, and updates the corresponding sentiments.
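A single consumer of this kind might look like the sketch below. The methods of `database.DataStore` are not shown in this documentation, so the `dataStore` interface, its `SaveText` and `IncrSentiment` methods, and the `memStore` type are all hypothetical. The sketch also assumes that `SentimentCategory` is already populated when the value arrives.

```go
package main

import "fmt"

// TweetText mirrors the documented struct.
type TweetText struct {
	TextString        string
	SentimentCategory string
}

// dataStore is a hypothetical interface; the real database.DataStore
// may expose different methods.
type dataStore interface {
	SaveText(text string)
	IncrSentiment(category string)
}

// memStore is a hypothetical in-memory implementation of dataStore.
type memStore struct {
	texts  []string
	counts map[string]int
}

func (s *memStore) SaveText(text string)          { s.texts = append(s.texts, text) }
func (s *memStore) IncrSentiment(category string) { s.counts[category]++ }

// consumeAndSave drains in until it is closed, saving each text and
// bumping the counter for its sentiment category.
func consumeAndSave(in <-chan TweetText, db dataStore) {
	for t := range in {
		db.SaveText(t.TextString)
		db.IncrSentiment(t.SentimentCategory)
	}
}

func main() {
	in := make(chan TweetText, 3)
	in <- TweetText{TextString: "i feel proud", SentimentCategory: "proud"}
	in <- TweetText{TextString: "i am feeling proud", SentimentCategory: "proud"}
	in <- TweetText{TextString: "i feel nervous", SentimentCategory: "nervous"}
	close(in)

	db := &memStore{counts: map[string]int{}}
	consumeAndSave(in, db)
	fmt.Println(len(db.texts), db.counts["proud"], db.counts["nervous"]) // prints 3 2 1
}
```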

func ConsumeVTPubPT ¶

func ConsumeVTPubPT(in []chan TweetText, out []chan TweetText, indexRR *MemRR)

ConsumeVTPubPT starts goroutines, each of which consumes a specific input channel and publishes the processed data to one channel from a collection of output channels.

func MemPartitions ¶

func MemPartitions(num int, buffer int) []chan TweetText

MemPartitions returns a slice of num buffered channels, each with a capacity of buffer.
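As a rough illustration of what such a constructor does, the sketch below builds the partitions with `make`. The lowercase `memPartitions` is a hypothetical stand-in written for this example, not the package's source.

```go
package main

import "fmt"

// TweetText mirrors the documented struct.
type TweetText struct {
	TextString        string
	SentimentCategory string
}

// memPartitions is a hypothetical stand-in for MemPartitions: it returns
// num channels, each able to hold buffer values before a send blocks.
func memPartitions(num int, buffer int) []chan TweetText {
	parts := make([]chan TweetText, num)
	for i := range parts {
		parts[i] = make(chan TweetText, buffer)
	}
	return parts
}

func main() {
	parts := memPartitions(3, 10)
	// The send does not block because the buffer has room.
	parts[0] <- TweetText{TextString: "I feel great today"}
	fmt.Println(len(parts), cap(parts[0]), len(parts[0])) // prints 3 10 1
}
```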

func NextIndex ¶

func NextIndex(m *MemRR, max int) int

NextIndex takes the in-memory round-robin object and returns its next Index value. If Index is less than the specified max, it is incremented by 1; once Index reaches max, it is reset to 0. In both cases the updated Index is returned.
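The behaviour described above can be sketched as follows. This is one reading of the description, not the package's source: `memRR` and `nextIndex` are hypothetical stand-ins, and the mutex is an assumption about the struct's unexported fields, added so concurrent callers do not race on Index.

```go
package main

import (
	"fmt"
	"sync"
)

// memRR is a hypothetical stand-in for MemRR. The mutex is an assumption;
// the documented struct only exposes Index.
type memRR struct {
	Index int
	mu    sync.Mutex
}

// nextIndex increments Index while it is below max, otherwise wraps it
// to 0, and returns the updated value.
func nextIndex(m *memRR, max int) int {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.Index < max {
		m.Index++
	} else {
		m.Index = 0
	}
	return m.Index
}

func main() {
	rr := &memRR{}
	for i := 0; i < 5; i++ {
		fmt.Print(nextIndex(rr, 2), " ")
	}
	fmt.Println() // prints 1 2 0 1 2
}
```

Under this reading the returned values range over 0 through max inclusive, so a caller indexing into a slice of channels would presumably pass the last valid index (length minus one) as max.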

func PubProcessedText ¶

func PubProcessedText(in chan TweetText, out []chan TweetText, indexRR *MemRR)

PubProcessedText consumes from a channel of TweetText, processes the data, and publishes it to an appropriate channel from a collection of channels. The channel is selected via a round-robin mechanism.
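A consume, process, and publish stage of this shape might be sketched as below. Several things here are assumptions: `process` is a hypothetical transformation (trim and lowercase), `pubProcessed` is a hypothetical name, and a local counter replaces the shared `*MemRR` that the real function receives.

```go
package main

import (
	"fmt"
	"strings"
)

// TweetText mirrors the documented struct.
type TweetText struct {
	TextString        string
	SentimentCategory string
}

// process is a hypothetical processing step; the package's actual
// processing is not described in this documentation.
func process(t TweetText) TweetText {
	t.TextString = strings.ToLower(strings.TrimSpace(t.TextString))
	return t
}

// pubProcessed drains in until it is closed and spreads the processed
// values across out in round-robin order. A local counter is used here
// in place of a shared *MemRR.
func pubProcessed(in <-chan TweetText, out []chan TweetText) {
	next := 0
	for t := range in {
		out[next] <- process(t)
		next = (next + 1) % len(out)
	}
}

func main() {
	in := make(chan TweetText, 4)
	out := []chan TweetText{make(chan TweetText, 4), make(chan TweetText, 4)}
	for _, s := range []string{" Happy ", "SAD", "Calm", "Angry "} {
		in <- TweetText{TextString: s}
	}
	close(in)

	pubProcessed(in, out)
	fmt.Println(len(out[0]), len(out[1])) // prints 2 2
}
```

Round-robin selection spreads load evenly only when items cost roughly the same to handle downstream; it does not account for how full each output channel already is.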

func PubValidText ¶

func PubValidText(vt TweetText, chansArray []chan TweetText, indexRR *MemRR)

PubValidText publishes valid TweetText data to an appropriate partition (channel) from a collection of channels. The channel is selected via a round-robin mechanism.

Types ¶

type MemRR ¶

type MemRR struct {
	Index int
	// contains filtered or unexported fields
}

MemRR represents an in-memory RoundRobin object.

type TweetText ¶

type TweetText struct {
	TextString        string
	SentimentCategory string
}

TweetText represents a valid text string that can be considered for sentiment analysis, based on the criteria set by the PANAS-t paper.

Source Files ¶

pipeline.go
