feed2json

Version: v0.20.1

This package is not in the latest version of its module.

Published: Jul 9, 2020 License: MIT Imports: 11 Imported by: 2

README

feed2json

Versioning: Calver v0.YY.Minor

Given an Atom or RSS feed, creates a comparable JSON feed.

See demo webapp on Heroku (sample conversion).

Installation

First install Go.

If you only want the binary installed in your current directory and don't need the source code, run:

GOBIN="$(pwd)" GOPATH="$(mktemp -d)" go get github.com/carlmjohnson/feed2json/...

Screenshots

$ feed2json -h
feed2json converts an Atom or RSS feed into a JSON feed.

    feed2json [opts]

Options:
  -dst file
        destination file (default stdout)
  -src file or URL
        source file or URL (default stdin)
  -timeout duration
        timeout for URL sources (default 5s)

$ feed2json -src 'https://jsonfeed.org/xml/rss.xml' | json-tidy
{
    "description": "JSON Feed is a pragmatic syndication format for blogs, microblogs, and other time-based content.",
    "home_page_url": "https://jsonfeed.org/",
    "items": [
        {
            "content_html": "<p>We — Manton Reece and Brent Simmons — have noticed that JSON has become the developers’ choice for APIs, and that developers will often go out of their way to avoid XML. JSON is simpler to read and write, and it’s less prone to bugs.</p>\n\n<p>So we developed JSON Feed, a format similar to <a href=\"http://cyber.harvard.edu/rss/rss.html\">RSS</a> and <a href=\"https://tools.ietf.org/html/rfc4287\">Atom</a> but in JSON. It reflects the lessons learned from our years of work reading and publishing feeds.</p>\n\n<p><a href=\"https://jsonfeed.org/version/1\">See the spec</a>. It’s at version 1, which may be the only version ever needed. If future versions are needed, version 1 feeds will still be valid feeds.</p>\n\n<h4>Notes</h4>\n\n<p>We have a <a href=\"https://github.com/manton/jsonfeed-wp\">WordPress plugin</a> and, coming soon, a JSON Feed Parser for Swift. As more code is written, by us and others, we’ll update the <a href=\"https://jsonfeed.org/code\">code</a> page.</p>\n\n<p>See <a href=\"https://jsonfeed.org/mappingrssandatom\">Mapping RSS and Atom to JSON Feed</a> for more on the similarities between the formats.</p>\n\n<p>This website — the Markdown files and supporting resources — <a href=\"https://github.com/brentsimmons/JSONFeed\">is up on GitHub</a>, and you’re welcome to comment there.</p>\n\n<p>This website is also a blog, and you can subscribe to the <a href=\"https://jsonfeed.org/xml/rss.xml\">RSS feed</a> or the <a href=\"https://jsonfeed.org/feed.json\">JSON feed</a> (if your reader supports it).</p>\n\n<p>We worked with a number of people on this over the course of several months. We list them, and thank them, at the bottom of the <a href=\"https://jsonfeed.org/version/1\">spec</a>. But — most importantly — <a href=\"http://furbo.org/\">Craig Hockenberry</a> spent a little time making it look pretty. :)</p>",
            "date_published": "2017-05-17T15:02:12Z",
            "id": "https://jsonfeed.org/2017/05/17/announcing_json_feed",
            "title": "Announcing JSON Feed",
            "url": "https://jsonfeed.org/2017/05/17/announcing_json_feed"
        }
    ],
    "title": "JSON Feed",
    "version": "https://jsonfeed.org/version/1"
}

$ feed2jsonweb -h
feed2jsonweb is an HTTP server that converts Atom and RSS feeds to JSON feeds

Usage:

    feed2jsonweb [opts]


Options:

  -allow-host host
        require requested URLs to be on host
  -cors-origin value
        allow these CORS origins (default *)
  -host name
        host name to listen for (default "127.0.0.1")
  -max-age duration
        set Cache-Control: public, max-age header (default 5m0s)
  -param string
        expect URL in this query param (default "url")
  -port number
        port number to listen on (default "8080")
  -read-timeout duration
        timeout for reading request headers (default 1s)
  -request-timeout duration
        timeout for fetching XML (default 1s)
  -url-path string
        serve requests on this path (default "/")
  -write-timeout duration
        timeout for writing response (default 2s)

Note: -allow-host and -cors-origin can be passed multiple times to set more hosts and origins. Options can also be passed as environment variables (CAPITALIZED_WITH_UNDERSCORES).
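
As a rough illustration of how a client might address a feed2jsonweb server started with the defaults listed above (host 127.0.0.1, port 8080, path "/", query param "url"), the sketch below builds the request URL with the standard library. The helper name requestURL is hypothetical, and the plain "http" scheme is an assumption; the feed URL must be query-escaped, which url.Values takes care of.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// requestURL builds the address a client would fetch from a feed2jsonweb
// server. The helper is hypothetical; the "http" scheme is an assumption.
func requestURL(host, port, path, param, feed string) string {
	q := url.Values{}
	q.Set(param, feed) // query-escapes the feed URL
	u := url.URL{
		Scheme:   "http",
		Host:     net.JoinHostPort(host, port),
		Path:     path,
		RawQuery: q.Encode(),
	}
	return u.String()
}

func main() {
	// Defaults taken from the -host, -port, -url-path and -param flags.
	fmt.Println(requestURL("127.0.0.1", "8080", "/", "url",
		"https://jsonfeed.org/xml/rss.xml"))
}
```

If the server is started with a different -param or -url-path, the corresponding arguments change accordingly.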

Documentation

Overview

Package feed2json converts Atom and RSS feeds to JSON feeds.

Constants

This section is empty.

Variables

This section is empty.

Functions

func Convert

func Convert(from, to *bytes.Buffer) (err error)

Convert takes an XML feed from one buffer and turns it into a JSON feed in the other buffer.

Example
package main

import (
	"bytes"
	"encoding/json"
	"fmt"

	"github.com/carlmjohnson/feed2json"
)

func main() {
	var from, to bytes.Buffer
	from.WriteString(`
<?xml version="1.0"?>
<rss version="2.0">
   <channel>
      <title>Liftoff News</title>
      <link>http://liftoff.msfc.nasa.gov/</link>
      <description>Liftoff to Space Exploration.</description>
      <language>en-us</language>
      <pubDate>Tue, 10 Jun 2003 04:00:00 GMT</pubDate>
      <lastBuildDate>Tue, 10 Jun 2003 09:41:01 GMT</lastBuildDate>
      <docs>http://blogs.law.harvard.edu/tech/rss</docs>
      <generator>Weblog Editor 2.0</generator>
      <managingEditor>editor@example.com</managingEditor>
      <webMaster>webmaster@example.com</webMaster>
      <item>
         <title>Star City</title>
         <link>http://liftoff.msfc.nasa.gov/news/2003/news-starcity.asp</link>
         <description>How do Americans get ready to work with Russians aboard the International Space Station? They take a crash course in culture, language and protocol at Russia's &lt;a href="http://howe.iki.rssi.ru/GCTC/gctc_e.htm"&gt;Star City&lt;/a&gt;.</description>
         <pubDate>Tue, 03 Jun 2003 09:39:21 GMT</pubDate>
         <guid>http://liftoff.msfc.nasa.gov/2003/06/03.html#item573</guid>
      </item>
   </channel>
</rss>
    `)
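	// Convert reads the XML feed from `from` and writes the JSON feed to `to`.
	if err := feed2json.Convert(&from, &to); err != nil {
		fmt.Println("convert:", err)
		return
	}
	// Reuse `from` as scratch space for the indented JSON.
	from.Reset()
	if err := json.Indent(&from, to.Bytes(), "", "  "); err != nil {
		fmt.Println("indent:", err)
		return
	}
	fmt.Println(from.String())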
}
Output:

{
  "version": "https://jsonfeed.org/version/1",
  "title": "Liftoff News",
  "home_page_url": "http://liftoff.msfc.nasa.gov/",
  "description": "Liftoff to Space Exploration.",
  "author": {},
  "items": [
    {
      "id": "http://liftoff.msfc.nasa.gov/2003/06/03.html#item573",
      "url": "http://liftoff.msfc.nasa.gov/news/2003/news-starcity.asp",
      "title": "Star City",
      "content_html": "How do Americans get ready to work with Russians aboard the International Space Station? They take a crash course in culture, language and protocol at Russia's <a href=\"http://howe.iki.rssi.ru/GCTC/gctc_e.htm\">Star City</a>.",
      "date_published": "2003-06-03T09:39:21Z"
    }
  ]
}

func FeedURLFromContext added in v0.0.4

func FeedURLFromContext(ctx context.Context) (u *url.URL, valid bool)

FeedURLFromContext returns the feed URL and its validity as stored in the context by Handler, so that middleware can inspect them.

func Handler

func Handler(x URLExtractor, v URLValidator, c *http.Client, l Logger, ms ...Middleware) http.Handler

Handler returns an http.Handler that extracts and validates a URL for each request, then stores the URL and its validity in the request context with SetFeedURLContext. The user-provided middleware, if any, wraps the innermost handler, which reads the URL back with FeedURLFromContext and fetches valid URLs with the provided http.Client.

If c is nil, it defaults to http.DefaultClient. If l is nil, it defaults to log.Printf.

func SetFeedURLContext added in v0.0.4

func SetFeedURLContext(ctx context.Context, u *url.URL, valid bool) context.Context

SetFeedURLContext allows middleware to intercept Handler calls and change the feed URL or its validity.

Types

type Logger

type Logger = func(format string, v ...interface{})

Logger is a user-provided callback that matches the calling conventions of fmt.Printf and log.Printf.

type Middleware

type Middleware = func(http.Handler) http.Handler

Middleware wraps an http.Handler in another http.Handler.

type URLExtractor

type URLExtractor = func(*http.Request) *url.URL

URLExtractor is a user-provided callback that determines the URL of an XML feed based on a request.

func ExtractURLFromParam

func ExtractURLFromParam(name string) URLExtractor

ExtractURLFromParam is a URLExtractor that extracts a URL from the query param specified by name.

func StaticURLInjector added in v0.0.6

func StaticURLInjector(staticurl string) URLExtractor

StaticURLInjector is a URLExtractor that always injects the same URL, the provided string.

type URLValidator

type URLValidator = func(*url.URL) bool

URLValidator is a user-provided callback that determines whether the URL for an XML feed is valid for Handler.

func ValidateHost

func ValidateHost(names ...string) URLValidator

ValidateHost is a URLValidator that approves of URLs where the hostname is in the names list.

Directories

Path Synopsis
cmd
