Documentation ¶
Overview ¶
Package jsonextract implements functions for finding and extracting any valid JavaScript object (not just JSON) from an io.Reader.
This is an example of valid input for this package:
	<script>
	var x = {
		// Keys without quotes are valid in JavaScript, but not in JSON
		key: "value",
		num: 295.2,

		// Comments are removed while processing

		// Mixing normal and quoted keys is possible
		"obj": {
			"quoted": 325,
			'other quotes': true,
			unquoted: 'test', // This trailing comma will be removed
		},

		// JSON doesn't support all these number formats
		"dec": +21,
		"hex": 0x15,
		"oct": 0o25,
		"bin": 0b10101,
		bigint: 21n,

		// NaN will be converted to null. Infinity values are however not supported
		"num2": NaN,

		// Undefined will be interpreted as null
		"udef": undefined,

		`lastvalue`: `multiline strings are no problem`
	}
	</script>
The input will be searched for anything that looks like JavaScript notation. Found objects and arrays are converted to JSON, which can then be used for decoding into Go structures.
Objects is a high-level function for easily extracting certain objects no matter their position within any other object. Reader is a lower-level function that gives you more control over how you process objects and arrays.
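To illustrate the decoding step mentioned in the overview, here is a minimal sketch that uses only the standard library. The JSON bytes are an assumption about roughly what a callback might receive for part of the input above (with NaN already converted to null); the exact output of the package may differ. The payload type and decode helper are hypothetical names used only for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// payload is a hypothetical struct for a few keys of the example input.
type payload struct {
	Key  string   `json:"key"`
	Num  float64  `json:"num"`
	Num2 *float64 `json:"num2"` // pointer, so that null stays distinguishable from 0
}

// decode unmarshals JSON bytes (as a callback would receive them) into a payload.
func decode(b []byte) (payload, error) {
	var p payload
	err := json.Unmarshal(b, &p)
	return p, err
}

func main() {
	// Assumed shape of the converted JSON; not taken from the package's real output.
	converted := []byte(`{"key":"value","num":295.2,"num2":null}`)

	p, err := decode(converted)
	if err != nil {
		panic(err)
	}
	fmt.Println(p.Key, p.Num, p.Num2 == nil)
}
```

Using a pointer field for values that may become null is one possible design choice; a plain float64 field would simply keep its zero value.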
Variables ¶
var ErrCallbackNeverCalled = errors.New("callback never called")
ErrCallbackNeverCalled is returned from Objects if the callback of a required ObjectOption was never satisfied, meaning that the callback never returned ErrStop.
var ErrStop = errors.New("stop processing json")
ErrStop can be returned from a JSONCallback function to indicate that processing should stop. When used with Reader, it will stop processing. When used with Objects, the callback function will never be called again (e.g. after it received the required data).
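The stop-signal pattern described for ErrStop can be sketched with the standard library alone. In this sketch, errStop and forEach are hypothetical local stand-ins (not the package's own ErrStop or Reader), and the way the package compares errors internally is not something this example claims to reproduce.

```go
package main

import (
	"errors"
	"fmt"
)

// errStop is a local stand-in for a sentinel error such as ErrStop.
var errStop = errors.New("stop processing json")

// forEach is a hypothetical driver: it calls cb for each item, treats errStop
// as a clean stop, and returns any other error to the caller.
func forEach(items [][]byte, cb func(b []byte) error) error {
	for _, item := range items {
		if err := cb(item); err != nil {
			if errors.Is(err, errStop) {
				return nil // a requested stop is not a failure
			}
			return err
		}
	}
	return nil
}

// firstItem returns the first item seen and how often the callback ran.
func firstItem(items [][]byte) (first string, calls int, err error) {
	err = forEach(items, func(b []byte) error {
		calls++
		first = string(b)
		return errStop // we have what we need
	})
	return first, calls, err
}

func main() {
	items := [][]byte{[]byte(`{"a":1}`), []byte(`{"b":2}`)}
	first, calls, err := firstItem(items)
	fmt.Println(first, calls, err)
}
```

The callback runs once, stores the first item, and the driver returns nil because the stop was requested on purpose.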
Functions ¶
func Objects ¶ added in v1.4.0
func Objects(r io.Reader, o []ObjectOption) (err error)
Objects extracts all nested objects and passes them to appropriate callback functions. You can define which keys must be present for an object to be passed to your function.
This function checks not only top-level object keys, but also those of all child objects.
If multiple options match an object, only the first one is processed. This allows you to cascade options: list the options with the most keys first, then those with fewer, which is useful if there are overlapping keys.
If a required option is not matched, ErrCallbackNeverCalled will be returned.
Arrays/Slices will not cause a callback as they don't have keys, but objects in them will be matched.
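The matching rules above (all listed keys must be present, the first matching option wins, nested objects and objects inside arrays are checked) can be sketched as follows. This is an illustration built on encoding/json, not the package's implementation; option, hasAllKeys, walk and matchCounts are hypothetical names, and the traversal order is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// option is a simplified, hypothetical stand-in for ObjectOption.
type option struct {
	name string
	keys []string
}

// hasAllKeys reports whether every key in keys is present in obj.
func hasAllKeys(obj map[string]interface{}, keys []string) bool {
	for _, k := range keys {
		if _, ok := obj[k]; !ok {
			return false
		}
	}
	return true
}

// walk visits v and everything nested inside it, counting for each object
// the first option whose keys are all present.
func walk(v interface{}, opts []option, counts map[string]int) {
	switch t := v.(type) {
	case map[string]interface{}:
		for _, o := range opts {
			if hasAllKeys(t, o.keys) {
				counts[o.name]++
				break // only the first matching option is used
			}
		}
		for _, child := range t {
			walk(child, opts, counts)
		}
	case []interface{}:
		// Arrays have no keys, but the objects inside them are still checked.
		for _, child := range t {
			walk(child, opts, counts)
		}
	}
}

// matchCounts decodes data and returns how many objects each option matched.
func matchCounts(data []byte, opts []option) (map[string]int, error) {
	var root interface{}
	if err := json.Unmarshal(data, &root); err != nil {
		return nil, err
	}
	counts := make(map[string]int)
	walk(root, opts, counts)
	return counts, nil
}

func main() {
	data := []byte(`{"title":"Starship","urlCanonical":"u","items":[
		{"videoId":"a","title":{"runs":[{"text":"one"}]}},
		{"videoId":"b","title":{"runs":[{"text":"two"}]}}]}`)
	opts := []option{
		{name: "video", keys: []string{"videoId", "title"}},
		{name: "playlist", keys: []string{"title", "urlCanonical"}},
	}
	counts, err := matchCounts(data, opts)
	if err != nil {
		panic(err)
	}
	fmt.Println(counts["video"], counts["playlist"])
}
```

Because the first matching option wins, placing an option with more keys before one with a subset of those keys keeps the more specific option from being shadowed.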
Example (MultipleList) ¶
This example shows how to extract both a single object and a list of other objects.
	// Define all structs we need for extraction
	type ytVideo struct {
		VideoID string `json:"videoId"`
		Title   struct {
			Runs []struct {
				Text string `json:"text"`
			} `json:"runs"`
		} `json:"title"`
	}

	type ytPlaylist struct {
		URLCanonical string `json:"urlCanonical"`
		Title        string `json:"title"`
	}

	// This is where our data should end up
	var (
		videoList []ytVideo
		playlist  ytPlaylist
	)

	// This file contains the HTML response of a YouTube playlist.
	// One could also extract directly from a response body
	f, err := os.Open("testdata/playlist.html")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	err = Objects(f, []ObjectOption{
		{
			// All videos have a "videoId" and "title" key
			Keys: []string{"videoId", "title"},

			// We use a more specialized callback to append to videoList
			Callback: func(b []byte) error {
				var vid ytVideo

				// Decode the given object. It has at least the Keys defined above
				err := json.Unmarshal(b, &vid)
				if err != nil {
					// if that didn't work, we skip the object
					return nil
				}

				// Check if anything required is missing
				if len(vid.Title.Runs) == 0 || vid.VideoID == "" {
					return nil
				}

				// Seems like we got the info we wanted, we can now store it
				videoList = append(videoList, vid)

				// ... and continue with the next object
				return nil
			},
		},
		{
			// Here we want to extract a playlist info object
			Keys: []string{"title", "urlCanonical"},
			Callback: Unmarshal(&playlist, func() bool {
				return playlist.Title != "" && playlist.URLCanonical != ""
			}),
		},
	})
	if err != nil {
		panic(err)
	}

	fmt.Printf("The %q playlist has %d videos\n", playlist.Title, len(videoList))
Output: The "Starship" playlist has 10 videos
Example (NestedObjects) ¶
This example shows how to extract nested objects.
	// Test input
	var input = strings.NewReader(`
	<script>
	var x = {
		"id": 339750489, // This comment makes the input invalid JSON
		"node_id": "MDEwOlJlcG9zaXRvcnkzMzk3NTA0ODk=",
		"name": "jsonextract",
		"full_name": "xarantolus/jsonextract",
		"private": false,
		"owner": {
			"login": "xarantolus",
			"id": 32465636,
			"node_id": "MDQ6VXNlcjMyNDY1NjM2",
			"avatar_url": "https://avatars.githubusercontent.com/u/32465636?v=4",
			"gravatar_id": "",
			"html_url": "https://github.com/xarantolus",
			"type": "User",
			"site_admin": false
		},
		"html_url": "https://github.com/xarantolus/jsonextract",
		"description": "Go package for finding and extracting any valid JavaScript object (not just JSON) from an io.Reader",
		"open_issues_count": 0,
		"license": {
			"key": "mit",
			"name": "MIT License",
			"spdx_id": "MIT",
			"url": "https://api.github.com/licenses/mit",
			"node_id": "MDc6TGljZW5zZTEz"
		},
	}
	</script>`)

	// The "license" object has this structure
	type License struct {
		Key    string `json:"key"`
		Name   string `json:"name"`
		SpdxID string `json:"spdx_id"`
		URL    string `json:"url"`
		NodeID string `json:"node_id"`
	}

	// ... and the "owner" object has this one
	type Owner struct {
		Login      string `json:"login"`
		ID         int    `json:"id"`
		NodeID     string `json:"node_id"`
		AvatarURL  string `json:"avatar_url"`
		GravatarID string `json:"gravatar_id"`
		HTMLURL    string `json:"html_url"`
		Type       string `json:"type"`
		SiteAdmin  bool   `json:"site_admin"`
	}

	// We want to extract these two different objects that are nested within
	// the whole JSON-like structure
	var (
		license License
		owner   Owner
	)

	// Use Objects to extract all objects and match them to their keys
	err := Objects(input, []ObjectOption{
		{
			// A valid license object has these keys
			Keys: []string{"key", "name", "spdx_id", "node_id"},
			// Unmarshal decodes the object to license until the function verifies that correct data was found
			// If there were multiple objects matching the keys, one could select the one that is wanted
			Callback: Unmarshal(&license, func() bool {
				// Return true if all fields we want have valid values
				return license.Key != "" && license.Name != ""
			}),
			// If this value is not present in the JSON data, the Objects call will return an error
			Required: true,
		},
		{
			// The owner object mostly has different keys, the overlap with "node_id"
			// doesn't matter because all listed keys must be present anyways
			Keys: []string{"login", "id", "html_url", "node_id"},
			Callback: Unmarshal(&owner, func() bool {
				return owner.Login != "" && owner.HTMLURL != ""
			}),
			Required: true,
		},
	})
	if err != nil {
		panic(err)
	}

	fmt.Printf("%s has published their package under the %s\n", owner.Login, license.Name)
Output: xarantolus has published their package under the MIT License
func Reader ¶
func Reader(reader io.Reader, callback JSONCallback) (err error)
Reader reads all JSON and JavaScript objects from the input and calls callback for each of them.
If the callback returns an error, Reader stops processing and returns that error. The exception is ErrStop, which stops processing but causes Reader to return nil.
Please note that the reader must return UTF-8 bytes for this to work correctly.
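One way to guard against non-UTF-8 input is to validate the bytes before handing them to the extractor. The following is a minimal sketch using only the standard library; utf8Reader, checkBytes and errNotUTF8 are hypothetical names, and the sketch assumes the input is small enough to be buffered in memory.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
	"unicode/utf8"
)

// errNotUTF8 is a hypothetical error for input that is not valid UTF-8.
var errNotUTF8 = errors.New("input is not valid UTF-8")

// utf8Reader reads all of r and returns a new reader over the same bytes,
// or errNotUTF8 if the bytes are not valid UTF-8.
func utf8Reader(r io.Reader) (io.Reader, error) {
	data, err := io.ReadAll(r)
	if err != nil {
		return nil, err
	}
	if !utf8.Valid(data) {
		return nil, errNotUTF8
	}
	return bytes.NewReader(data), nil
}

// checkBytes is a small convenience wrapper around utf8Reader for byte slices.
func checkBytes(data []byte) error {
	_, err := utf8Reader(bytes.NewReader(data))
	return err
}

func main() {
	fmt.Println(checkBytes([]byte(`{key: "value"}`)))
	fmt.Println(checkBytes([]byte{0xff, 0xfe, '{', '}'}))
}
```

Input in another encoding would need to be converted to UTF-8 first; that conversion is outside the scope of this sketch.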
Types ¶
type JSONCallback ¶
type JSONCallback func(b []byte) error
JSONCallback is the callback function passed to Reader and ObjectOptions.
Each JSON object found is passed to the function as a byte slice.
If this function returns an error, processing will stop and return that error. You can return ErrStop to make sure the function will not be called again.
func Unmarshal ¶ added in v1.4.0
func Unmarshal(pointer interface{}, verify func() bool) JSONCallback
Unmarshal returns a callback function that can be used with the Objects function for decoding one element. Once verify returns true, the value that pointer refers to will no longer be changed.
Please note that any Unmarshal errors will be ignored, which means that if you don't pass a pointer or your struct field types don't match the ones in the data, you will not be notified about the error.
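If silently ignored decoding errors are a concern, a hand-written callback can report them instead. The sketch below is an assumption-laden illustration, not part of the package: strictUnmarshal, decodeLicense and the local errStop are hypothetical, and the callback shape func(b []byte) error is taken from the examples above.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// errStop is a local stand-in for the package's ErrStop sentinel.
var errStop = errors.New("stop processing json")

// strictUnmarshal is a hypothetical helper (not part of the package). It
// returns a callback that reports decoding errors instead of ignoring them
// and returns errStop once verify accepts the decoded value.
func strictUnmarshal(pointer interface{}, verify func() bool) func(b []byte) error {
	return func(b []byte) error {
		if err := json.Unmarshal(b, pointer); err != nil {
			return fmt.Errorf("decoding object: %w", err)
		}
		if verify() {
			return errStop
		}
		return nil
	}
}

// license mirrors two fields of the License struct from the example above.
type license struct {
	Key  string `json:"key"`
	Name string `json:"name"`
}

// decodeLicense runs the strict callback once on b and returns the result.
func decodeLicense(b []byte) (license, error) {
	var l license
	cb := strictUnmarshal(&l, func() bool { return l.Key != "" })
	err := cb(b)
	return l, err
}

func main() {
	l, err := decodeLicense([]byte(`{"key":"mit","name":"MIT License"}`))
	fmt.Println(l.Name, errors.Is(err, errStop))

	_, err = decodeLicense([]byte(`{"key":42}`))
	fmt.Println(err != nil && !errors.Is(err, errStop))
}
```

Note the trade-off: since errors other than ErrStop are returned from Objects, a strict callback like this would abort extraction on the first object that has the listed keys but mismatching types, whereas the lenient behavior simply moves on to the next object.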
type ObjectOption ¶ added in v1.4.0
	type ObjectOption struct {
		// Keys defines a filter for objects. Only objects where these keys are present will be passed to Callback.
		// If this is not set, all objects will be passed to the callback.
		Keys []string

		// Callback receives JSON bytes for all objects that have all keys defined by Keys.
		// Returning ErrStop will stop extraction without error. Other errors will be returned.
		Callback JSONCallback

		// Required sets whether ErrCallbackNeverCalled should be returned if the callback function for this ObjectOption is not called
		Required bool
	}
ObjectOption defines filters and callbacks for the Objects function.