pipeline

package
v0.0.0-...-e4649f2
Warning

This package is not in the latest version of its module.

Published: May 6, 2022 | License: MIT | Imports: 4 | Imported by: 0

Documentation

Overview

Package pipeline provides an implementation of a multi-stage sentiment analysis pipeline. The analysis is based on the PANAS-t paper and uses Twitter data. The package also provides a mechanism for processing data in parallel within the pipeline, and a mechanism for balancing load across the parallel workers.

Constants

This section is empty.

Variables

This section is empty.

Functions

func ComputeAndSave

func ComputeAndSave(in []chan TweetText, db database.DataStore)

ComputeAndSave starts goroutines, each of which consumes a specific channel and performs the ComputeSentimentAndSave operation on the DB.
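The fan-out described above can be sketched as follows. This is a minimal illustration, not the package's implementation: `store`, `memStore`, and `computeAndSaveSketch` are hypothetical names, the `store` interface stands in for `database.DataStore` (whose methods are not shown on this page), and the returned `sync.WaitGroup` is an assumption added so the example can wait for its workers to finish.

```go
package main

import (
	"fmt"
	"sync"
)

// TweetText mirrors the struct documented in this package.
type TweetText struct {
	TextString        string
	SentimentCategory string
}

// store is a hypothetical stand-in for database.DataStore.
type store interface {
	Save(t TweetText)
}

// memStore is a hypothetical in-memory store used only for this sketch.
type memStore struct {
	mu    sync.Mutex
	saved []TweetText
}

func (s *memStore) Save(t TweetText) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.saved = append(s.saved, t)
}

// computeAndSaveSketch starts one goroutine per input channel. Each
// goroutine drains its channel and saves every item it receives.
func computeAndSaveSketch(in []chan TweetText, db store) *sync.WaitGroup {
	wg := &sync.WaitGroup{}
	for _, ch := range in {
		wg.Add(1)
		go func(c chan TweetText) {
			defer wg.Done()
			for t := range c { // ends when c is closed
				db.Save(t)
			}
		}(ch)
	}
	return wg
}

func main() {
	in := []chan TweetText{make(chan TweetText, 1), make(chan TweetText, 1)}
	db := &memStore{}
	wg := computeAndSaveSketch(in, db)

	for i := 0; i < 4; i++ {
		in[i%len(in)] <- TweetText{TextString: fmt.Sprintf("tweet %d", i)}
	}
	for _, ch := range in {
		close(ch) // lets the workers exit their range loops
	}
	wg.Wait()
	fmt.Println(len(db.saved)) // prints 4
}
```

Closing the channels is what lets each worker return; a long-running pipeline might instead leave them open for the lifetime of the process.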

func ComputeSentimentAndSave

func ComputeSentimentAndSave(in chan TweetText, db database.DataStore)

ComputeSentimentAndSave consumes processed TweetText values from a channel, saves the text to the DB, and updates the corresponding sentiments.

func ConsumeVTPubPT

func ConsumeVTPubPT(in []chan TweetText, out []chan TweetText, indexRR *MemRR)

ConsumeVTPubPT starts goroutines, each of which consumes a specific input channel and publishes the processed data to one channel chosen from the collection of output channels.

func MemPartitions

func MemPartitions(num int, buffer int) []chan TweetText

MemPartitions returns a slice of buffered channels, based on the partition count and buffer value.
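A function with this signature could plausibly be written as below. This is a sketch under the assumption that each partition is an independent buffered channel; `memPartitionsSketch` is a hypothetical name and is not the package's code.

```go
package main

import "fmt"

// TweetText mirrors the struct documented in this package.
type TweetText struct {
	TextString        string
	SentimentCategory string
}

// memPartitionsSketch returns num channels, each with capacity buffer.
func memPartitionsSketch(num int, buffer int) []chan TweetText {
	parts := make([]chan TweetText, num)
	for i := range parts {
		parts[i] = make(chan TweetText, buffer)
	}
	return parts
}

func main() {
	parts := memPartitionsSketch(3, 10)
	// A buffered send does not block while the channel has spare capacity.
	parts[0] <- TweetText{TextString: "feeling great today"}
	fmt.Println(len(parts), cap(parts[0]), len(parts[0])) // prints 3 10 1
}
```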

func NextIndex

func NextIndex(m *MemRR, max int) int

NextIndex takes the in-memory round-robin object and advances its Index: if Index is less than the specified max, it is incremented by 1; once Index reaches max, it is reset to 0. In both cases the updated Index is returned.
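Read literally, the description gives the behaviour sketched below. Two things here are assumptions: the mutex (MemRR's unexported fields are not shown on this page, and a lock is only one plausible choice) and the inclusive treatment of max. The names `memRR` and `nextIndexSketch` are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// memRR is a hypothetical stand-in for MemRR. The mutex is an assumption
// about what the unexported fields might hold.
type memRR struct {
	mu    sync.Mutex
	Index int
}

// nextIndexSketch increments Index while it is below max and wraps to 0
// once Index has reached max, returning the updated value.
func nextIndexSketch(m *memRR, max int) int {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.Index < max {
		m.Index++
	} else {
		m.Index = 0
	}
	return m.Index
}

func main() {
	rr := &memRR{}
	for i := 0; i < 5; i++ {
		fmt.Print(nextIndexSketch(rr, 2), " ")
	}
	fmt.Println() // prints 1 2 0 1 2
}
```

Under this reading the returned values range from 0 to max inclusive, so a caller indexing a slice of channels would pass `len(chans)-1` as max.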

func PubProcessedText

func PubProcessedText(in chan TweetText, out []chan TweetText, indexRR *MemRR)

PubProcessedText consumes TweetText values from a channel, processes them, and publishes each result to a channel chosen from a collection of channels. The channel is selected via a round-robin mechanism.

func PubValidText

func PubValidText(vt TweetText, chansArray []chan TweetText, indexRR *MemRR)

PubValidText publishes valid TweetText data to a partition (channel) chosen from a collection of channels. The channel is selected via a round-robin mechanism.
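Round-robin publishing of this kind can be illustrated as follows. For brevity the sketch uses a simple modulo counter instead of the NextIndex semantics; `rrCounter` and `pubValidTextSketch` are hypothetical names and do not reflect the package's internals.

```go
package main

import (
	"fmt"
	"sync"
)

// TweetText mirrors the struct documented in this package.
type TweetText struct {
	TextString        string
	SentimentCategory string
}

// rrCounter is a hypothetical, simplified stand-in for MemRR.
type rrCounter struct {
	mu    sync.Mutex
	index int
}

// next returns the current slot and advances the counter, wrapping at n.
func (r *rrCounter) next(n int) int {
	r.mu.Lock()
	defer r.mu.Unlock()
	i := r.index
	r.index = (r.index + 1) % n
	return i
}

// pubValidTextSketch sends vt to the next partition in rotation.
func pubValidTextSketch(vt TweetText, chans []chan TweetText, rr *rrCounter) {
	chans[rr.next(len(chans))] <- vt
}

func main() {
	chans := []chan TweetText{make(chan TweetText, 2), make(chan TweetText, 2)}
	rr := &rrCounter{}
	for _, s := range []string{"a", "b", "c"} {
		pubValidTextSketch(TweetText{TextString: s}, chans, rr)
	}
	// "a" and "c" land in partition 0, "b" in partition 1.
	fmt.Println(len(chans[0]), len(chans[1])) // prints 2 1
}
```

Sharing one counter across publishers spreads items evenly over the partitions, which is the load-balancing effect the overview refers to.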

Types

type MemRR

type MemRR struct {
	Index int
	// contains filtered or unexported fields
}

MemRR represents an in-memory RoundRobin object.

type TweetText

type TweetText struct {
	TextString        string
	SentimentCategory string
}

TweetText represents a valid text string that can be considered for sentiment analysis, based on the criteria set by the PANAS-t paper.
