align

package module
v0.1.2
Published: Jul 15, 2020 License: Apache-2.0 Imports: 16 Imported by: 0

README

= otf-classifier
Web service that aligns free text to the text of the learning progressions, treating alignment as document classification

NOTE: This is experimental and proof-of-concept code

NOTE: The project has moved to Go modules. Remember to export `GO111MODULE=on` before running `go get` and `go build` if your Go toolchain does not enable module support by default.

This code is based on https://github.com/nsip/curriculum-align[]. It sets up a document classifier
(https://en.wikipedia.org/wiki/Tf–idf[]) that classifies arbitrary text as aligning to the learning progressions
the code is provisioned with, and outputs the alignments as a web service.

The code is set up to align to indicators; however training corpora are built up against both indicator codes and development level codes, and the latter can be used instead. You can switch by setting the variable `granularity` in classifier.go to be `"Devlevel"` instead of `"Indicator"`.

Binary distributions of the code are available in the `build/` directory.

The web service is made available as a library (`Align()`); the `cmd` directory contains a sample shell for it, which is used in the binary distribution. In the sample shell, the web service runs on port 1576. The test script `test.sh` issues representative REST queries against the web service.

The web service takes the following arguments:

[source,console]
----
GET http://localhost:1576/align?area=W&text=....
----

where _area_ is the learning area (`Numeracy` or `Literacy`), and _text_ is the text to be aligned. Both the _text_ and the _area_ parameters are obligatory. 

For example:

[source,console]
----
http://localhost:1576/align?area=Numeracy&text=information
----

For larger payloads or automated environments calling the classifier, an equivalent POST method is also available, which accepts a JSON payload:

[source,console]
-----
curl http://localhost:1576/align -H 'Content-Type: application/json' -d'{"area":"literacy","text":"confident sentences"}'
-----
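
The two request forms above can also be built from Go. The following is a minimal sketch using only the standard library; the helper names `alignURL` and `alignPost` are hypothetical and are not part of this package. It constructs the requests without sending them, so it runs without the service being up.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
)

// alignURL builds the GET form of the query, escaping both parameters.
// (Hypothetical helper, not part of the align package.)
func alignURL(base, area, text string) string {
	q := url.Values{}
	q.Set("area", area)
	q.Set("text", text)
	return base + "/align?" + q.Encode()
}

// alignPost builds, but does not send, the POST form with a JSON body.
// (Hypothetical helper, not part of the align package.)
func alignPost(base, area, text string) (*http.Request, error) {
	body, err := json.Marshal(map[string]string{"area": area, "text": text})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, base+"/align", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	const base = "http://localhost:1576" // port used by the sample shell

	fmt.Println(alignURL(base, "Numeracy", "collects information"))

	req, err := alignPost(base, "literacy", "confident sentences")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL, req.Header.Get("Content-Type"))
	// To actually send the request: resp, err := http.DefaultClient.Do(req)
}
```

Building the query with `url.Values` rather than string concatenation keeps free text containing spaces or punctuation safe to pass in the `text` parameter.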

The response is a JSON list of structs, one for each curriculum standard that the service is configured for, with the following fields:

* Item: the identifier of the curriculum item (indicator) whose alignment is reported
* DevLevel: the identifier of the development level corresponding to the curriculum item (indicator)
* Path: the path down to the indicator or development level
* Text: the text of the curriculum item whose alignment is reported
* Score: the score of the alignment. This is the score generated by github.com/jbrukh/bayesian: it is a negative number, and the higher the number (i.e. the closer to zero), the better the alignment of the text to the curriculum standard.
* Matches: the top five words that were the basis for the alignment of the curriculum item to the text
** Word: the matching word
** Score: the logarithmic score

For example:

[source,console]
----
[
  {
    "Item": "uri/version/00b902b5-7065-430f-b6de-f9b92aac85ff",
    "Text": "identifies symmetry in the environment",
    "DevLevel": "UGP3",
    "Path": [
      {
        "Key": "General Capability",
        "Val": "Numeracy"
      },
      {
        "Key": "Element",
        "Val": "Measurement and geometry"
      },
      {
        "Key": "Sub-element",
        "Val": "Understanding geometric properties"
      },
      {
        "Key": "Progression Level",
        "Val": "UGP3"
      },
      {
        "Key": "Heading",
        "Val": "Transformations"
      },
      {
        "Key": "Indicator",
        "Val": "identifies and creates patterns involving one- and two-step transformations of shapes (e.g. uses pattern blocks to create a pattern and describes how the pattern was created)"
      }
    ],
    "Score": -58.50905189596907,
    "Matches": [
      {
        "Word": "collects",
        "Score": -25.328436022934504
      },
      {
        "Word": "information",
        "Score": -25.328436022934504
      }
    ]
  }
]
----


The documents passed to the document classifier are also indexed, and can be queried through
the web service `index`:

[source,console]
----
GET http://localhost:1576/index?search=word
----

The `Path` lookup of an indicator or progression level code or URI can be queried through the web service
`lookup`:

[source,console]
----
GET http://localhost:1576/lookup?search=UGP3
----

To embed the service in another labstack/echo web server, replicate the `main()` code from `cmd/main.go`:

[source,go]
----
align.Init()
e := echo.New()
e.GET("/align", align.Align)
// Path lookup for an indicator or progression level code or URI.
e.GET("/lookup", func(c echo.Context) error {
	query := c.QueryParam("search")
	ret, err := align.Lookup(query)
	if err != nil {
		return err
	}
	return c.String(http.StatusOK, string(ret))
})
// Search over the documents indexed by the classifier.
e.GET("/index", func(c echo.Context) error {
	query := c.QueryParam("search")
	ret, err := align.Search(query)
	if err != nil {
		return err
	}
	return c.String(http.StatusOK, string(ret))
})
----

The web service is configured to read any JSON files in the `curricula` folder of the executable; the file included in the distribution is a mockup of the proposed machine encoding of the National Learning Progressions.

== 1576

https://en.wikipedia.org/wiki/Curriculum[]:

> The word "curriculum" began as a Latin word which means "a race" or "the course of a race" (which in turn derives from the verb _currere_ meaning "to run/to proceed"). The first known use in an educational context is in the https://books.google.com.au/books?id=bG5EAAAAcAAJ&printsec=frontcover&hl=el&source=gbs_ge_summary_r&cad=0#v=onepage&q=curriculum&f=false[_Professio Regia_], a work by University of Paris professor https://en.wikipedia.org/wiki/Petrus_Ramus[Petrus Ramus] published https://en.wikipedia.org/wiki/St._Bartholomew%27s_Day_massacre[posthumously] in 1576.

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func Align

func Align(c echo.Context) error

func Init

func Init()

func InitTokeniser

func InitTokeniser() error

func Keys added in v0.1.1

func Keys(m map[string]*CurricContent) (keys []string)

func Lookup added in v0.1.1

func Lookup(query string) (interface{}, error)

func Search

func Search(query string) ([]byte, error)

Search runs a simple Bleve search and returns the search results as JSON.

func Tokenise

func Tokenise(id string, txt string, record interface{}) []string

Tokenise both tokenises a string and indexes a record (which contains the string). If id is not empty, the record is indexed with id as its identifier.

Types

type AlignmentQuery added in v0.1.1

type AlignmentQuery struct {
	Area string `json:"area" form:"area" query:"area"`
	Text string `json:"text" form:"text" query:"text"`
}

AlignmentQuery holds the query parameters for classifier alignment. It supports query-string, form and JSON payload inputs.

Area: the LP capability, currently Literacy or Numeracy.

Text: the input to send to the classifier, such as observation or question text.
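
To illustrate how one struct can serve both the query-string and JSON inputs, here is a sketch that uses the standard library in place of the echo binder the package relies on. The lower-case type `alignmentQuery` and the helpers `fromJSON`, `fromQuery` and `validate` are hypothetical; the required-field check reflects the README statement that both parameters are obligatory, not the package's actual validation code.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/url"
)

// alignmentQuery copies the json tags of AlignmentQuery. The form and query
// tags are omitted because this sketch does not use the echo binder.
type alignmentQuery struct {
	Area string `json:"area"`
	Text string `json:"text"`
}

// fromJSON decodes a JSON payload such as the one sent by the POST method.
func fromJSON(body string) (alignmentQuery, error) {
	var q alignmentQuery
	err := json.Unmarshal([]byte(body), &q)
	return q, err
}

// fromQuery decodes a raw query string such as the one sent by the GET method.
func fromQuery(raw string) (alignmentQuery, error) {
	v, err := url.ParseQuery(raw)
	if err != nil {
		return alignmentQuery{}, err
	}
	return alignmentQuery{Area: v.Get("area"), Text: v.Get("text")}, nil
}

// validate is an assumed check: it rejects a query missing either parameter.
func (q alignmentQuery) validate() error {
	if q.Area == "" || q.Text == "" {
		return errors.New("area and text are both required")
	}
	return nil
}

func main() {
	a, err := fromJSON(`{"area":"literacy","text":"confident sentences"}`)
	if err != nil {
		panic(err)
	}
	b, err := fromQuery("area=literacy&text=confident+sentences")
	if err != nil {
		panic(err)
	}
	fmt.Println(a == b, a.validate())
}
```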

type AlignmentType

type AlignmentType struct {
	Item     string
	Text     string
	DevLevel string
	Path     []*Keyval
	Score    float64
	Matches  []bayesian.MatchStruct
}

type ClassifierType

type ClassifierType struct {
	Classifier *bayesian.Classifier
	Classes    []bayesian.Class
}

type CurricContent

type CurricContent struct {
	Text     []string
	DevLevel string
	Path     []*Keyval
}

type Curriculum

type Curriculum = map[string]map[string]map[string]*CurricContent

type Keyval

type Keyval struct {
	Key string
	Val string
}

Directories

Path Synopsis
A Naive Bayesian Classifier Jake Brukhman <jbrukh@gmail.com> BAYESIAN CLASSIFICATION REFRESHER: suppose you have a set of classes (e.g.
cmd
