align

package module

Version: v0.1.2 (latest) Published: Jul 15, 2020 License: Apache-2.0 Imports: 16 Imported by: 0

Repository

github.com/nsip/otf-classifier

README ¶

= otf-classifier
Web service that aligns free text to the text of the learning progressions, treating alignment as a document classification task

NOTE: This is experimental and proof-of-concept code

NOTE: The project has moved to Go modules. If your Go toolchain does not enable module support by default, remember to `export GO111MODULE=on` before running `go get` and `go build`.

This code is based on https://github.com/nsip/curriculum-align[]. It sets up a document classifier
(https://en.wikipedia.org/wiki/Tf–idf[]) that classifies arbitrary text as aligning to the learning progressions
the code is provisioned with, and outputs the alignments through a web service.

The code is set up to align to indicators; however training corpora are built up against both indicator codes and development level codes, and the latter can be used instead. You can switch by setting the variable `granularity` in classifier.go to be `"Devlevel"` instead of `"Indicator"`.

Binary distributions of the code are available in the `build/` directory.

The web service is made available as a library (`Align()`); the `cmd` directory contains a sample shell for it, which is used in the binary distribution. In the sample shell, the web service runs on port 1576. The test script `test.sh` issues representative REST queries against the web service.

The web service takes the following arguments:

[source,console]
----
GET http://localhost:1576/align?area=W&text=....
----

where _area_ is the learning area (`Numeracy` or `Literacy`), and _text_ is the text to be aligned. Both the _area_ and the _text_ parameters are required.

For example:

[source,console]
----
http://localhost:1576/align?area=Numeracy&text=information
----
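
Free text usually contains spaces and reserved characters, so the _text_ parameter needs percent-encoding before it goes into the query string. The following is a minimal sketch of building the request URL with Go's standard `net/url` package. The helper name `buildAlignURL` and the sample text are illustrative assumptions, not part of this package.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildAlignURL is a hypothetical helper (not part of otf-classifier).
// It returns the GET URL for the /align endpoint with both required
// parameters percent-encoded.
func buildAlignURL(base, area, text string) string {
	q := url.Values{}
	q.Set("area", area)
	q.Set("text", text)
	// Encode sorts the keys and escapes the values.
	return base + "/align?" + q.Encode()
}

func main() {
	// The base address assumes the sample shell running locally on port 1576.
	fmt.Println(buildAlignURL("http://localhost:1576", "Numeracy", "collects information & sorts it"))
}
```

The resulting URL can be passed to `http.Get` or to any other HTTP client.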

For larger payloads, or for automated clients calling the classifier, an equivalent POST method is also available; it accepts a JSON payload:

[source,console]
-----
curl http://localhost:1576/align -H 'Content-Type: application/json' -d'{"area":"literacy","text":"confident sentences"}'
-----
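
The same POST request can be prepared from Go. This is a sketch under stated assumptions: `alignQuery` and `newAlignRequest` are hypothetical names chosen for illustration, and the struct simply mirrors the two JSON keys used in the curl example above. The request is built but not sent, so the snippet runs without a live server.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// alignQuery mirrors the JSON payload from the curl example
// (hypothetical type, declared here so the sketch is self-contained).
type alignQuery struct {
	Area string `json:"area"`
	Text string `json:"text"`
}

// newAlignRequest builds, but does not send, a POST request for /align.
// It also returns the encoded body so callers can inspect or log it.
func newAlignRequest(base, area, text string) (*http.Request, []byte, error) {
	body, err := json.Marshal(alignQuery{Area: area, Text: text})
	if err != nil {
		return nil, nil, err
	}
	req, err := http.NewRequest(http.MethodPost, base+"/align", bytes.NewReader(body))
	if err != nil {
		return nil, nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, body, nil
}

func main() {
	// The base address assumes the sample shell running locally on port 1576.
	req, body, err := newAlignRequest("http://localhost:1576", "literacy", "confident sentences")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL, string(body))
	// To send the request: resp, err := http.DefaultClient.Do(req)
}
```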

The response is a JSON list of structs, one for each curriculum standard that the service is configured for, with the following fields:

* Item: the identifier of the curriculum item (indicator) whose alignment is reported
* Text: the text of the curriculum item whose alignment is reported
* DevLevel: the identifier of the development level corresponding to the curriculum item (indicator)
* Path: the path down to the indicator or development level, as a list of `Key`/`Val` pairs
* Score: the score of the alignment. This is the score generated by github.com/jbrukh/bayesian: it is a negative number, and the higher the number (i.e. the closer to zero), the better the alignment of the text to the curriculum standard.
* Matches: the top five words that were the basis for the alignment of the curriculum item to the text
** Word: the matching word
** Score: the logarithmic score

For example:

[source,console]
----
[
  {
    "Item": "uri/version/00b902b5-7065-430f-b6de-f9b92aac85ff",
    "Text": "identifies symmetry in the environment",
    "DevLevel": "UGP3",
    "Path": [
      {
        "Key": "General Capability",
        "Val": "Numeracy"
      },
      {
        "Key": "Element",
        "Val": "Measurement and geometry"
      },
      {
        "Key": "Sub-element",
        "Val": "Understanding geometric properties"
      },
      {
        "Key": "Progression Level",
        "Val": "UGP3"
      },
      {
        "Key": "Heading",
        "Val": "Transformations"
      },
      {
        "Key": "Indicator",
        "Val": "identifies and creates patterns involving one- and two-step transformations of shapes (e.g. uses pattern blocks to create a pattern and describes how the pattern was created)"
      }
    ],
    "Score": -58.50905189596907,
    "Matches": [
      {
        "Word": "collects",
        "Score": -25.328436022934504
      },
      {
        "Word": "information",
        "Score": -25.328436022934504
      }
    ]
  }
]
----
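
Because a higher score (closer to zero) means a better alignment, a client will typically decode the list and keep the entry with the largest `Score`. The sketch below shows one way to do that. The local types mirror the field names in the example response, `bestAlignment` is a hypothetical helper, and the items `"a"` and `"b"` in the sample payload are made up for illustration.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Local types mirroring the field names in the example response.
type keyval struct {
	Key string
	Val string
}

type match struct {
	Word  string
	Score float64
}

type alignment struct {
	Item     string
	Text     string
	DevLevel string
	Path     []keyval
	Score    float64
	Matches  []match
}

// bestAlignment (hypothetical helper) decodes a response body and returns
// the entry with the highest score, i.e. the score closest to zero.
func bestAlignment(data []byte) (alignment, error) {
	var results []alignment
	if err := json.Unmarshal(data, &results); err != nil {
		return alignment{}, err
	}
	if len(results) == 0 {
		return alignment{}, errors.New("no alignments in response")
	}
	best := results[0]
	for _, r := range results[1:] {
		if r.Score > best.Score {
			best = r
		}
	}
	return best, nil
}

func main() {
	// Made-up sample payload with two entries.
	sample := []byte(`[
	  {"Item": "a", "DevLevel": "UGP3", "Score": -58.5},
	  {"Item": "b", "DevLevel": "UGP4", "Score": -61.2}
	]`)
	best, err := bestAlignment(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(best.Item, best.DevLevel, best.Score)
}
```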


The documents passed to the document classifier are also indexed, and can be queried through
the web service `index`:

[source,console]
----
GET http://localhost:1576/index?search=word
----

The `Path` lookup of an indicator or progression level code or URI can be queried through the web service
`lookup`:

[source,console]
----
GET http://localhost:1576/lookup?search=UGP3
----

To embed the service in another labstack/echo web server, replicate the `main()` code from `cmd/main.go`:

[source,go]
----
// Initialise the package before registering the routes.
align.Init()
e := echo.New()
e.GET("/align", align.Align)
e.GET("/lookup", func(c echo.Context) error {
	query := c.QueryParam("search")
	ret, err := align.Lookup(query)
	if err != nil {
		return err
	}
	// Lookup returns interface{} (see the index below), so encode it as JSON.
	return c.JSON(http.StatusOK, ret)
})
e.GET("/index", func(c echo.Context) error {
	query := c.QueryParam("search")
	ret, err := align.Search(query)
	if err != nil {
		return err
	}
	// Search returns its results as JSON-encoded bytes.
	return c.String(http.StatusOK, string(ret))
})
----

The web service is configured to read any JSON files in the `curricula` folder located alongside the executable; the file included in the distribution is a mockup of the proposed machine encoding of the National Learning Progressions.

== 1576

https://en.wikipedia.org/wiki/Curriculum[]:

> The word "curriculum" began as a Latin word which means "a race" or "the course of a race" (which in turn derives from the verb _currere_ meaning "to run/to proceed"). The first known use in an educational context is in the https://books.google.com.au/books?id=bG5EAAAAcAAJ&printsec=frontcover&hl=el&source=gbs_ge_summary_r&cad=0#v=onepage&q=curriculum&f=false[_Professio Regia_], a work by University of Paris professor https://en.wikipedia.org/wiki/Petrus_Ramus[Petrus Ramus] published https://en.wikipedia.org/wiki/St._Bartholomew%27s_Day_massacre[posthumously] in 1576.

Documentation ¶

Index ¶

func Align(c echo.Context) error
func Init()
func InitTokeniser() error
func Keys(m map[string]*CurricContent) (keys []string)
func Lookup(query string) (interface{}, error)
func Search(query string) ([]byte, error)
func Tokenise(id string, txt string, record interface{}) []string
type AlignmentQuery
type AlignmentType
type ClassifierType
type CurricContent
type Curriculum
type Keyval

Functions ¶

func Align ¶

func Align(c echo.Context) error

func Init ¶

func Init()

func InitTokeniser ¶

func InitTokeniser() error

func Keys ¶ added in v0.1.1

func Keys(m map[string]*CurricContent) (keys []string)

func Lookup ¶ added in v0.1.1

func Lookup(query string) (interface{}, error)

func Search ¶

func Search(query string) ([]byte, error)

Search runs a simple Bleve search and returns the search results as JSON.

func Tokenise ¶

func Tokenise(id string, txt string, record interface{}) []string

Tokenise both tokenises a string and indexes a record (which contains the string). If id is not empty, the string is indexed with id as its identifier.

Types ¶

type AlignmentQuery ¶ added in v0.1.1

type AlignmentQuery struct {
	Area string `json:"area" form:"area" query:"area"`
	Text string `json:"text" form:"text" query:"text"`
}

AlignmentQuery holds the query parameters for classifier alignment. It supports query-string, form and JSON payload inputs.

Area: the LP capability, currently Literacy or Numeracy.
Text: the input to send to the classifier, such as observation or question text.

type AlignmentType ¶

type AlignmentType struct {
	Item     string
	Text     string
	DevLevel string
	Path     []*Keyval
	Score    float64
	Matches  []bayesian.MatchStruct
}

type ClassifierType ¶

type ClassifierType struct {
	Classifier *bayesian.Classifier
	Classes    []bayesian.Class
}

type CurricContent ¶

type CurricContent struct {
	Text     []string
	DevLevel string
	Path     []*Keyval
}

type Curriculum ¶

type Curriculum = map[string]map[string]map[string]*CurricContent

type Keyval ¶

type Keyval struct {
	Key string
	Val string
}

Directories ¶

Path	Synopsis
bayesian	A Naive Bayesian Classifier Jake Brukhman <jbrukh@gmail.com> BAYESIAN CLASSIFICATION REFRESHER: suppose you have a set of classes (e.g.
cmd
otf-classifier
