bluemonday

package module
v1.0.6
Published: Apr 5, 2021 License: BSD-3-Clause Imports: 9 Imported by: 1,964

README

bluemonday

bluemonday is an HTML sanitizer implemented in Go. It is fast and highly configurable.

bluemonday takes untrusted user generated content as an input, and will return HTML that has been sanitised against a whitelist of approved HTML elements and attributes so that you can safely include the content in your web page.

If you accept user generated content, and your server uses Go, you need bluemonday.

The default policy for user generated content (bluemonday.UGCPolicy().Sanitize()) turns this:

Hello <STYLE>.XSS{background-image:url("javascript:alert('XSS')");}</STYLE><A CLASS=XSS></A>World

Into a harmless:

Hello World

And it turns this:

<a href="javascript:alert('XSS1')" onmouseover="alert('XSS2')">XSS<a>

Into this:

XSS

Whilst still allowing this:

<a href="http://www.google.com/">
  <img src="https://ssl.gstatic.com/accounts/ui/logo_2x.png"/>
</a>

To pass through mostly unaltered (it gained a rel="nofollow" which is a good thing for user generated content):

<a href="http://www.google.com/" rel="nofollow">
  <img src="https://ssl.gstatic.com/accounts/ui/logo_2x.png"/>
</a>

It protects sites from XSS attacks. There are many vectors for an XSS attack and the best way to mitigate the risk is to sanitize user input against a known safe list of HTML elements and attributes.

You should always run bluemonday after any other processing.

If you use blackfriday or Pandoc then bluemonday should be run after these steps. This ensures that no insecure HTML is introduced later in your process.

bluemonday is heavily inspired by both the OWASP Java HTML Sanitizer and the HTML Purifier.

Technical Summary

bluemonday is whitelist based: you need to either build a policy describing the HTML elements and attributes to permit (and the regexp patterns of attributes), or use one of the supplied policies representing good defaults.

The policy containing the whitelist is applied using a fast non-validating, forward only, token-based parser implemented in the Go net/html library by the core Go team.

We expect to be supplied with well-formatted HTML (closing elements for every applicable open element, nested correctly) and so we do not focus on repairing badly nested or incomplete HTML. We focus on simply ensuring that whatever elements do exist are described in the policy whitelist and that attributes and links are safe for use on your web page. GIGO does apply and if you feed it bad HTML bluemonday is not tasked with figuring out how to make it good again.

Supported Go Versions

bluemonday is tested against Go 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 1.10, 1.11, 1.12, and tip.

We do not support Go 1.0 as we depend on golang.org/x/net/html which includes a reference to io.ErrNoProgress which did not exist in Go 1.0.

We support Go 1.1 but Travis no longer tests against it.

Is it production ready?

Yes

We are using bluemonday in production having migrated from the widely used and heavily field tested OWASP Java HTML Sanitizer.

We are passing our extensive test suite (including AntiSamy tests as well as tests for any issues raised). Check for any unresolved issues to see whether anything may be a blocker for you.

We invite pull requests and issues to help us ensure we are offering comprehensive protection against various attacks via user generated content.

Usage

Install in your ${GOPATH} using go get -u github.com/microcosm-cc/bluemonday

Then call it:

package main

import (
	"fmt"

	"github.com/microcosm-cc/bluemonday"
)

func main() {
	// Do this once for each unique policy, and use the policy for the life of the program
	// Policy creation/editing is not safe to use in multiple goroutines
	p := bluemonday.UGCPolicy()

	// The policy can then be used to sanitize lots of input and it is safe to use the policy in multiple goroutines
	html := p.Sanitize(
		`<a onblur="alert(secret)" href="http://www.google.com">Google</a>`,
	)

	// Output:
	// <a href="http://www.google.com" rel="nofollow">Google</a>
	fmt.Println(html)
}

We offer three ways to call Sanitize:

p.Sanitize(string) string
p.SanitizeBytes([]byte) []byte
p.SanitizeReader(io.Reader) *bytes.Buffer

If you are obsessed about performance, p.SanitizeReader(r).Bytes() will return a []byte without performing any unnecessary casting of the inputs or outputs. Though the difference is so negligible you should never need to care.

You can build your own policies:

package main

import (
	"fmt"

	"github.com/microcosm-cc/bluemonday"
)

func main() {
	p := bluemonday.NewPolicy()

	// Require URLs to be parseable by net/url.Parse and either:
	//   mailto: http:// or https://
	p.AllowStandardURLs()

	// We only allow <p> and <a href="">
	p.AllowAttrs("href").OnElements("a")
	p.AllowElements("p")

	html := p.Sanitize(
		`<a onblur="alert(secret)" href="http://www.google.com">Google</a>`,
	)

	// Output:
	// <a href="http://www.google.com">Google</a>
	fmt.Println(html)
}

We ship two default policies:

  1. bluemonday.StrictPolicy() which can be thought of as equivalent to stripping all HTML elements and their attributes as it has nothing on its whitelist. An example usage scenario would be blog post titles where HTML tags are not expected at all and if they are then the elements and the content of the elements should be stripped. This is a very strict policy.
  2. bluemonday.UGCPolicy() which allows a broad selection of HTML elements and attributes that are safe for user generated content. Note that this policy does not whitelist iframes, object, embed, styles, script, etc. An example usage scenario would be blog post bodies where a variety of formatting is expected along with the potential for TABLEs and IMGs.

Policy Building

The essence of building a policy is to determine which HTML elements and attributes are considered safe for your scenario. OWASP provide an XSS prevention cheat sheet to help explain the risks, but essentially:

  1. Avoid anything other than the standard HTML elements
  2. Avoid script, style, iframe, object, embed, base elements that allow code to be executed by the client or third party content to be included that can execute code
  3. Avoid anything other than plain HTML attributes with values matched to a regexp

Basically, you should be able to describe what HTML is fine for your scenario. If you do not have confidence that you can describe your policy please consider using one of the shipped policies such as bluemonday.UGCPolicy().

To create a new policy:

p := bluemonday.NewPolicy()

To add elements to a policy either add just the elements:

p.AllowElements("b", "strong")

Or using a regex:

p.AllowElementsMatching(regexp.MustCompile(`^my-element-`))

Note: if an element is added by name as shown above, any matching regex will be ignored.

It is also recommended to ensure multiple patterns don't overlap, as the order of execution is not guaranteed and overlapping patterns can result in some rules being missed.

Or add elements by virtue of adding an attribute:

// Not the recommended pattern, see the recommendation on using .Matching() below
p.AllowAttrs("nowrap").OnElements("td", "th")

Again, this also supports a regex pattern match alternative:

p.AllowAttrs("nowrap").OnElementsMatching(regexp.MustCompile(`^my-element-`))

Attributes can either be added to all elements:

p.AllowAttrs("dir").Matching(regexp.MustCompile("(?i)rtl|ltr")).Globally()

Or attributes can be added to specific elements:

// Not the recommended pattern, see the recommendation on using .Matching() below
p.AllowAttrs("value").OnElements("li")

It is always recommended that an attribute be made to match a pattern. XSS in HTML attributes is very easy otherwise:

// \p{L} matches unicode letters, \p{N} matches unicode numbers
p.AllowAttrs("title").Matching(regexp.MustCompile(`[\p{L}\p{N}\s\-_',:\[\]!\./\\\(\)&]*`)).Globally()
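
The value of anchoring a pattern can be seen with the standard regexp package alone. The sketch below does not call bluemonday; it assumes only that a pattern is applied as a substring search, which is how Go's regexp.MatchString behaves. The names loose and strict are illustrative. The anchored form is the same expression as the Direction variable listed under Variables.

```go
package main

import (
	"fmt"
	"regexp"
)

// loose is unanchored: it matches any value that merely contains "rtl" or "ltr".
var loose = regexp.MustCompile(`(?i)rtl|ltr`)

// strict is anchored: the whole value must be exactly "rtl" or "ltr".
var strict = regexp.MustCompile(`(?i)^(rtl|ltr)$`)

func main() {
	values := []string{"rtl", "LTR", "ultra", `ltr" onmouseover="alert(1)`}
	for _, v := range values {
		fmt.Printf("%-30q loose=%-5t strict=%t\n", v, loose.MatchString(v), strict.MatchString(v))
	}
}
```

Unless a substring match is really what you want, wrapping the pattern in ^ and $ keeps unexpected values from passing.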

You can stop at any time and call .Sanitize():

// string htmlIn passed in from a HTTP POST
htmlOut := p.Sanitize(htmlIn)

And you can take any existing policy and extend it:

p := bluemonday.UGCPolicy()
p.AllowElements("fieldset", "select", "option")

Inline CSS

Although it's possible to handle inline CSS using AllowAttrs with a Matching rule, writing a single monolithic regular expression to safely process all inline CSS which you wish to allow is not a trivial task. Instead of attempting to do so, you can whitelist the style attribute on whichever element(s) you desire and use style policies to control and sanitize inline styles.

It is suggested that you use Matching (with a suitable regular expression), MatchingEnum, or MatchingHandler to ensure each style matches your needs, but default handlers are supplied for most widely used styles.

Similar to attributes, you can allow specific CSS properties to be set inline:

p.AllowAttrs("style").OnElements("span", "p")
// Allow the 'color' property with valid RGB(A) hex values only (on any element allowed a 'style' attribute)
p.AllowStyles("color").Matching(regexp.MustCompile("(?i)^#([0-9a-f]{3,4}|[0-9a-f]{6}|[0-9a-f]{8})$")).Globally()

Additionally, you can allow a CSS property to be set only to an allowed value:

p.AllowAttrs("style").OnElements("span", "p")
// Allow the 'text-decoration' property to be set to 'underline', 'line-through' or 'none'
// on 'span' elements only
p.AllowStyles("text-decoration").MatchingEnum("underline", "line-through", "none").OnElements("span")

Or you can specify elements based on a regex pattern match:

p.AllowAttrs("style").OnElementsMatching(regexp.MustCompile(`^my-element-`))
// Allow the 'text-decoration' property to be set to 'underline', 'line-through' or 'none'
// on elements whose names match the pattern only
p.AllowStyles("text-decoration").MatchingEnum("underline", "line-through", "none").OnElementsMatching(regexp.MustCompile(`^my-element-`))

If you need more specific checking, you can create a handler that takes in a string and returns a bool to validate the values for a given property. The string parameter has been converted to lowercase and unicode code points have been converted.

myHandler := func(value string) bool {
	return true
}
p.AllowAttrs("style").OnElements("span", "p")
// Allow the 'color' property with values validated by the handler (on any element allowed a 'style' attribute)
p.AllowStyles("color").MatchingHandler(myHandler).Globally()
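
A handler that always returns true validates nothing, so here is a minimal sketch of a more restrictive one. It uses only the standard library. The names colorHandler, hexColor and namedColors are hypothetical, and the tiny list of named colors is an assumption for illustration. The pattern is lowercase only because, as described above, the value has already been lowercased before the handler is called.

```go
package main

import (
	"fmt"
	"regexp"
)

// hexColor matches 3, 4, 6 or 8 digit hex colors (lowercase only).
var hexColor = regexp.MustCompile(`^#([0-9a-f]{3,4}|[0-9a-f]{6}|[0-9a-f]{8})$`)

// namedColors is a deliberately small, hypothetical allowlist.
var namedColors = map[string]bool{"black": true, "white": true, "red": true}

// colorHandler has the func(string) bool shape described above: it reports
// whether the value is an allowed named color or a hex color.
func colorHandler(value string) bool {
	return namedColors[value] || hexColor.MatchString(value)
}

func main() {
	for _, v := range []string{"#fff", "red", "#12345", "url(javascript:alert(1))"} {
		fmt.Printf("%q allowed=%t\n", v, colorHandler(v))
	}
}
```

A function like this could be passed in place of myHandler in the snippet above.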

Links

Links are difficult beasts to sanitise safely and also one of the biggest attack vectors for malicious content.

It is possible to do this:

p.AllowAttrs("href").Matching(regexp.MustCompile(`(?i)mailto|https?`)).OnElements("a")

But that will not protect you as the regular expression is insufficient in this case to have prevented a malformed value doing something unexpected.
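
To see why, the sketch below runs the same pattern against a few href values using only the standard regexp package. The sample values are illustrative assumptions. Because the pattern is unanchored, it accepts any value that contains "mailto", "http" or "https" anywhere, including inside a javascript: URL.

```go
package main

import (
	"fmt"
	"regexp"
)

// hrefPattern is the pattern from the example above.
var hrefPattern = regexp.MustCompile(`(?i)mailto|https?`)

func main() {
	hrefs := []string{
		"https://example.com/",
		"javascript:alert('http')",
		"data:text/html,mailto",
	}
	for _, h := range hrefs {
		// Every one of these values matches, safe or not.
		fmt.Printf("%-28q matches=%t\n", h, hrefPattern.MatchString(h))
	}
}
```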

We provide some additional global options for safely working with links.

RequireParseableURLs will ensure that URLs are parseable by Go's net/url package:

p.RequireParseableURLs(true)

If you have enabled parseable URLs then the following option will allow relative URLs. By default this is disabled (bluemonday is a whitelist tool... you need to explicitly tell us to permit things) and when disabled it will prevent all local and scheme relative URLs (i.e. href="localpage.html", href="../home.html" and even href="//www.google.com" are relative):

p.AllowRelativeURLs(true)

If you have enabled parseable URLs then you can whitelist the schemes (commonly called protocol when thinking of http and https) that are permitted. Bear in mind that allowing relative URLs in the above option will allow for a blank scheme:

p.AllowURLSchemes("mailto", "http", "https")
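
How these three options interact can be sketched with net/url alone. This is not bluemonday's implementation: checkURL and allowedSchemes are hypothetical names, and the logic is a simplified assumption of "parse first, then check the scheme". It shows why a blank scheme only gets through when relative URLs are allowed.

```go
package main

import (
	"fmt"
	"net/url"
)

// allowedSchemes mirrors the schemes passed to AllowURLSchemes above.
var allowedSchemes = map[string]bool{"mailto": true, "http": true, "https": true}

// checkURL is a hypothetical helper. It reports whether raw parses as a URL
// and either has an allowed scheme or is relative while relative URLs are
// allowed.
func checkURL(raw string, allowRelative bool) bool {
	u, err := url.Parse(raw)
	if err != nil {
		return false // not parseable by net/url
	}
	if u.Scheme == "" {
		return allowRelative // local or scheme relative URL
	}
	return allowedSchemes[u.Scheme]
}

func main() {
	for _, raw := range []string{
		"https://example.com/",
		"mailto:user@example.com",
		"javascript:alert('http')",
		"//www.google.com",
		"../home.html",
		":bad",
	} {
		fmt.Printf("%-28q strict=%-5t relative=%t\n", raw, checkURL(raw, false), checkURL(raw, true))
	}
}
```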

Regardless of whether you have enabled parseable URLs, you can force all URLs to have a rel="nofollow" attribute. This will be added if it does not exist, but only when the href is valid:

// This applies to "a" "area" "link" elements that have a "href" attribute
p.RequireNoFollowOnLinks(true)

Similarly, you can force all URLs to have "noreferrer" in their rel attribute.

// This applies to "a" "area" "link" elements that have a "href" attribute
p.RequireNoReferrerOnLinks(true)

We provide a convenience method that applies all of the above, but you will still need to whitelist the linkable elements for the URL rules to be applied to:

p.AllowStandardURLs()
p.AllowAttrs("cite").OnElements("blockquote", "q")
p.AllowAttrs("href").OnElements("a", "area")
p.AllowAttrs("src").OnElements("img")

An additional complexity regarding links is the data URI as defined in RFC2397. The data URI allows for images to be served inline using this format:

<img src="">

We have provided a helper to verify the mimetype followed by base64 content of data URI links:

p.AllowDataURIImages()

That helper will enable GIF, JPEG, PNG and WEBP images.

It should be noted that there is a potential security risk with the use of data URI links. You should only enable data URI links if you already trust the content.

We also have some features to help deal with user generated content:

p.AddTargetBlankToFullyQualifiedLinks(true)

This will ensure that anchor <a href="" /> links that are fully qualified (the href destination includes a host name) will get target="_blank" added to them.

Additionally any link that has target="_blank" after the policy has been applied will also have the rel attribute adjusted to add noopener. This means a link may start like <a href="//host/path"/> and will end up as <a href="//host/path" rel="noopener" target="_blank">. It is important to note that the addition of noopener is a security feature and not an issue. There is an unfortunate feature to browsers that a browser window opened as a result of target="_blank" can still control the opener (your web page) and this protects against that. The background to this can be found here: https://dev.to/ben/the-targetblank-vulnerability-by-example
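
The rel adjustment described above amounts to adding a token to a space separated list without duplicating it. A minimal sketch of that idea, using only the standard library, follows. addRelToken is a hypothetical helper and not a bluemonday function.

```go
package main

import (
	"fmt"
	"strings"
)

// addRelToken is a hypothetical helper. It returns rel with token appended,
// unless token is already present (compared case-insensitively).
func addRelToken(rel, token string) string {
	fields := strings.Fields(rel)
	for _, f := range fields {
		if strings.EqualFold(f, token) {
			return strings.Join(fields, " ")
		}
	}
	return strings.Join(append(fields, token), " ")
}

func main() {
	fmt.Println(addRelToken("", "noopener"))
	fmt.Println(addRelToken("nofollow", "noopener"))
	fmt.Println(addRelToken("nofollow noopener", "noopener"))
}
```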

Policy Building Helpers

We also bundle some helpers to simplify policy building:


// Permits the "dir", "id", "lang", "title" attributes globally
p.AllowStandardAttributes()

// Permits the "img" element and its standard attributes
p.AllowImages()

// Permits ordered and unordered lists, and also definition lists
p.AllowLists()

// Permits HTML tables and all applicable elements and non-styling attributes
p.AllowTables()

Invalid Instructions

The following are invalid:

// This does not say where the attributes are allowed, you need to add
// .Globally() or .OnElements(...)
// This will be ignored without error.
p.AllowAttrs("value")

// This does not say where the attributes are allowed, you need to add
// .Globally() or .OnElements(...)
// This will be ignored without error.
p.AllowAttrs(
	"type",
).Matching(
	regexp.MustCompile("(?i)^(circle|disc|square|a|A|i|I|1)$"),
)

Both examples exhibit the same issue: they declare attributes but do not specify whether they are whitelisted globally or only on specific elements (and which elements). Attributes belong to one or more elements, and the policy needs to declare this.

Limitations

We are not yet including any tools to help whitelist and sanitize CSS. Which means that unless you wish to do the heavy lifting in a single regular expression (inadvisable), you should not allow the "style" attribute anywhere.

It is not the job of bluemonday to fix your bad HTML, it is merely the job of bluemonday to prevent malicious HTML getting through. If you have mismatched HTML elements, or non-conforming nesting of elements, those will remain. But if you have well-structured HTML bluemonday will not break it.

TODO

  • Investigate whether devs want to blacklist elements and attributes. This would allow devs to take an existing policy (such as the bluemonday.UGCPolicy() ) that encapsulates 90% of what they're looking for but does more than they need, and to remove the extra things they do not want to make it 100% what they want
  • Investigate whether devs want a validating HTML mode, in which the HTML elements are not just transformed into a balanced tree (every start tag has a closing tag at the correct depth) but also that elements and character data appear only in their allowed context (i.e. that a table element isn't a descendant of a caption, that colgroup, thead, tbody, tfoot and tr are permitted, and that character data is not permitted)

Development

If you have cloned this repo you will probably need the dependency:

go get golang.org/x/net/html

Gophers can use their familiar tools:

go build

go test

I personally use a Makefile as it spares typing the same args over and over whilst providing consistency for those of us who jump from language to language and enjoy just typing make in a project directory and watching magic happen.

make will build, vet, test and install the library.

make clean will remove the library from a single ${GOPATH}/pkg directory tree

make test will run the tests

make cover will run the tests and open a browser window with the coverage report

make lint will run golint (install via go get github.com/golang/lint/golint)

Long term goals

  1. Open the code to adversarial peer review similar to the Attack Review Ground Rules
  2. Raise funds and pay for an external security review

Documentation

Overview

Package bluemonday provides a way of describing a whitelist of HTML elements and attributes as a policy, and for that policy to be applied to untrusted strings from users that may contain markup. All elements and attributes not on the whitelist will be stripped.

The default bluemonday.UGCPolicy().Sanitize() turns this:

Hello <STYLE>.XSS{background-image:url("javascript:alert('XSS')");}</STYLE><A CLASS=XSS></A>World

Into the more harmless:

Hello World

And it turns this:

<a href="javascript:alert('XSS1')" onmouseover="alert('XSS2')">XSS<a>

Into this:

XSS

Whilst still allowing this:

<a href="http://www.google.com/">
  <img src="https://ssl.gstatic.com/accounts/ui/logo_2x.png"/>
</a>

To pass through mostly unaltered (it gained a rel="nofollow"):

<a href="http://www.google.com/" rel="nofollow">
  <img src="https://ssl.gstatic.com/accounts/ui/logo_2x.png"/>
</a>

The primary purpose of bluemonday is to take potentially unsafe user generated content (from things like Markdown, HTML WYSIWYG tools, etc) and make it safe for you to put on your website.

It protects sites against XSS (http://en.wikipedia.org/wiki/Cross-site_scripting) and other malicious content that a user interface may deliver. There are many vectors for an XSS attack (https://www.owasp.org/index.php/XSS_Filter_Evasion_Cheat_Sheet) and the safest thing to do is to sanitize user input against a known safe list of HTML elements and attributes.

Note: You should always run bluemonday after any other processing.

If you use blackfriday (https://github.com/russross/blackfriday) or Pandoc (http://johnmacfarlane.net/pandoc/) then bluemonday should be run after these steps. This ensures that no insecure HTML is introduced later in your process.

bluemonday is heavily inspired by both the OWASP Java HTML Sanitizer (https://code.google.com/p/owasp-java-html-sanitizer/) and the HTML Purifier (http://htmlpurifier.org/).

We ship two default policies, one is bluemonday.StrictPolicy() and can be thought of as equivalent to stripping all HTML elements and their attributes as it has nothing on its whitelist.

The other is bluemonday.UGCPolicy() and allows a broad selection of HTML elements and attributes that are safe for user generated content. Note that this policy does not whitelist iframes, object, embed, styles, script, etc.

The essence of building a policy is to determine which HTML elements and attributes are considered safe for your scenario. OWASP provide an XSS prevention cheat sheet ( https://www.google.com/search?q=xss+prevention+cheat+sheet ) to help explain the risks, but essentially:

  1. Avoid whitelisting anything other than plain HTML elements
  2. Avoid whitelisting `script`, `style`, `iframe`, `object`, `embed`, `base` elements
  3. Avoid whitelisting anything other than plain HTML attributes with simple values that you can match to a regexp

Example

package main

import (
	"fmt"
	"regexp"

	"github.com/microcosm-cc/bluemonday"
)

func main() {
	// Create a new policy
	p := bluemonday.NewPolicy()

	// Add elements to a policy without attributes
	p.AllowElements("b", "strong")

	// Add elements by virtue of adding an attribute
	p.AllowAttrs("nowrap").OnElements("td", "th")

	// Attributes can either be added to all elements
	p.AllowAttrs("dir").Globally()

	// Or attributes can be added to specific elements
	p.AllowAttrs("value").OnElements("li")

	// It is ALWAYS recommended that an attribute be made to match a pattern
	// XSS in HTML attributes is a very easy attack vector

	// \p{L} matches unicode letters, \p{N} matches unicode numbers
	p.AllowAttrs("title").Matching(regexp.MustCompile(`[\p{L}\p{N}\s\-_',:\[\]!\./\\\(\)&]*`)).Globally()

	// You can stop at any time and call .Sanitize()

	// Assumes that string htmlIn was passed in from a HTTP POST and contains
	// untrusted user generated content
	htmlIn := `untrusted user generated content <body onload="alert('XSS')">`
	fmt.Println(p.Sanitize(htmlIn))

	// And you can take any existing policy and extend it
	p = bluemonday.UGCPolicy()
	p.AllowElements("fieldset", "select", "option")

	// Links are complex beasts and one of the biggest attack vectors for
	// malicious content so we have included features specifically to help here.

	// This is not recommended:
	p = bluemonday.NewPolicy()
	p.AllowAttrs("href").Matching(regexp.MustCompile(`(?i)mailto|https?`)).OnElements("a")

	// The regexp is insufficient in this case to have prevented a malformed
	// value doing something unexpected.

	// This will ensure that URLs are not considered invalid by Go's net/url
	// package.
	p.RequireParseableURLs(true)

	// If you have enabled parseable URLs then the following option will allow
	// relative URLs. By default this is disabled and will prevent all local and
	// scheme relative URLs (i.e. `href="//www.google.com"` is scheme relative).
	p.AllowRelativeURLs(true)

	// If you have enabled parseable URLs then you can whitelist the schemes
	// that are permitted. Bear in mind that allowing relative URLs in the above
	// option allows for blank schemes.
	p.AllowURLSchemes("mailto", "http", "https")

	// Regardless of whether you have enabled parseable URLs, you can force all
	// URLs to have a rel="nofollow" attribute. This will be added if it does
	// not exist.

	// This applies to "a" "area" "link" elements that have a "href" attribute
	p.RequireNoFollowOnLinks(true)

	// We provide a convenience function that applies all of the above, but you
	// will still need to whitelist the linkable elements:
	p = bluemonday.NewPolicy()
	p.AllowStandardURLs()
	p.AllowAttrs("cite").OnElements("blockquote")
	p.AllowAttrs("href").OnElements("a", "area")
	p.AllowAttrs("src").OnElements("img")

	// Policy Building Helpers

	// If you've got this far and you're bored already, we also bundle some
	// other convenience functions
	p = bluemonday.NewPolicy()
	p.AllowStandardAttributes()
	p.AllowImages()
	p.AllowLists()
	p.AllowTables()
}

Constants

This section is empty.

Variables

var (
	// CellAlign handles the `align` attribute
	// https://developer.mozilla.org/en-US/docs/Web/HTML/Element/td#attr-align
	CellAlign = regexp.MustCompile(`(?i)^(center|justify|left|right|char)$`)

	// CellVerticalAlign handles the `valign` attribute
	// https://developer.mozilla.org/en-US/docs/Web/HTML/Element/td#attr-valign
	CellVerticalAlign = regexp.MustCompile(`(?i)^(baseline|bottom|middle|top)$`)

	// Direction handles the `dir` attribute
	// https://developer.mozilla.org/en-US/docs/Web/HTML/Element/bdo#attr-dir
	Direction = regexp.MustCompile(`(?i)^(rtl|ltr)$`)

	// ImageAlign handles the `align` attribute on the `image` tag
	// http://www.w3.org/MarkUp/Test/Img/imgtest.html
	ImageAlign = regexp.MustCompile(
		`(?i)^(left|right|top|texttop|middle|absmiddle|baseline|bottom|absbottom)$`,
	)

	// Integer describes whole positive integers (including 0) used in places
	// like td.colspan
	// https://developer.mozilla.org/en-US/docs/Web/HTML/Element/td#attr-colspan
	Integer = regexp.MustCompile(`^[0-9]+$`)

	// ISO8601 according to the W3 group is only a subset of the ISO8601
	// standard: http://www.w3.org/TR/NOTE-datetime
	//
	// Used in places like time.datetime
	// https://developer.mozilla.org/en-US/docs/Web/HTML/Element/time#attr-datetime
	//
	// Matches patterns:
	//  Year:
	//     YYYY (eg 1997)
	//  Year and month:
	//     YYYY-MM (eg 1997-07)
	//  Complete date:
	//     YYYY-MM-DD (eg 1997-07-16)
	//  Complete date plus hours and minutes:
	//     YYYY-MM-DDThh:mmTZD (eg 1997-07-16T19:20+01:00)
	//  Complete date plus hours, minutes and seconds:
	//     YYYY-MM-DDThh:mm:ssTZD (eg 1997-07-16T19:20:30+01:00)
	//  Complete date plus hours, minutes, seconds and a decimal fraction of a
	//  second
	//      YYYY-MM-DDThh:mm:ss.sTZD (eg 1997-07-16T19:20:30.45+01:00)
	ISO8601 = regexp.MustCompile(
		`^[0-9]{4}(-[0-9]{2}(-[0-9]{2}([ T][0-9]{2}(:[0-9]{2}){1,2}(.[0-9]{1,6})` +
			`?Z?([\+-][0-9]{2}:[0-9]{2})?)?)?)?$`,
	)

	// ListType encapsulates the common value as well as the latest spec
	// values for lists
	// https://developer.mozilla.org/en-US/docs/Web/HTML/Element/ol#attr-type
	ListType = regexp.MustCompile(`(?i)^(circle|disc|square|a|A|i|I|1)$`)

	// SpaceSeparatedTokens is used in places like `a.rel` and the common attribute
	// `class` which both contain space delimited lists of data tokens
	// http://www.w3.org/TR/html-markup/datatypes.html#common.data.tokens-def
	// Regexp: \p{L} matches unicode letters, \p{N} matches unicode numbers
	SpaceSeparatedTokens = regexp.MustCompile(`^([\s\p{L}\p{N}_-]+)$`)

	// Number is a double value used on HTML5 meter and progress elements
	// http://www.whatwg.org/specs/web-apps/current-work/multipage/the-button-element.html#the-meter-element
	Number = regexp.MustCompile(`^[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?$`)

	// NumberOrPercent is used predominantly as units of measurement in width
	// and height attributes
	// https://developer.mozilla.org/en-US/docs/Web/HTML/Element/img#attr-height
	NumberOrPercent = regexp.MustCompile(`^[0-9]+[%]?$`)

	// Paragraph of text in an attribute such as *.'title', img.alt, etc
	// https://developer.mozilla.org/en-US/docs/Web/HTML/Global_attributes#attr-title
	// Note that we are not allowing chars that could close tags like '>'
	Paragraph = regexp.MustCompile(`^[\p{L}\p{N}\s\-_',\[\]!\./\\\(\)]*$`)
)

A selection of regular expressions that can be used as .Matching() rules on HTML attributes.
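
To get a feel for what these patterns accept, the sketch below copies three of them from the listing above into local variables and tries a few values. It uses only the standard regexp package. The lowercase variable names and sample values are illustrative. In a real policy you would pass the exported variables (for example bluemonday.Integer) to .Matching() instead.

```go
package main

import (
	"fmt"
	"regexp"
)

// Local copies of three patterns from the listing above.
var (
	integer         = regexp.MustCompile(`^[0-9]+$`)
	numberOrPercent = regexp.MustCompile(`^[0-9]+[%]?$`)
	direction       = regexp.MustCompile(`(?i)^(rtl|ltr)$`)
)

func main() {
	checks := []struct {
		name  string
		re    *regexp.Regexp
		value string
	}{
		{"integer", integer, "3"},
		{"integer", integer, "-1"},
		{"numberOrPercent", numberOrPercent, "50%"},
		{"numberOrPercent", numberOrPercent, "50px"},
		{"direction", direction, "RTL"},
		{"direction", direction, "auto"},
	}
	for _, c := range checks {
		fmt.Printf("%-16s %-6q match=%t\n", c.name, c.value, c.re.MatchString(c.value))
	}
}
```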

Functions

func AlignContentHandler added in v1.0.3

func AlignContentHandler(value string) bool

func AlignItemsHandler added in v1.0.3

func AlignItemsHandler(value string) bool

func AlignSelfHandler added in v1.0.3

func AlignSelfHandler(value string) bool

func AllHandler added in v1.0.3

func AllHandler(value string) bool

func AnimationDelayHandler added in v1.0.3

func AnimationDelayHandler(value string) bool

func AnimationDirectionHandler added in v1.0.3

func AnimationDirectionHandler(value string) bool

func AnimationDurationHandler added in v1.0.3

func AnimationDurationHandler(value string) bool

func AnimationFillModeHandler added in v1.0.3

func AnimationFillModeHandler(value string) bool

func AnimationHandler added in v1.0.3

func AnimationHandler(value string) bool

func AnimationIterationCountHandler added in v1.0.3

func AnimationIterationCountHandler(value string) bool

func AnimationNameHandler added in v1.0.3

func AnimationNameHandler(value string) bool

func AnimationPlayStateHandler added in v1.0.3

func AnimationPlayStateHandler(value string) bool

func BackfaceVisibilityHandler added in v1.0.3

func BackfaceVisibilityHandler(value string) bool

func BackgroundAttachmentHandler added in v1.0.3

func BackgroundAttachmentHandler(value string) bool

func BackgroundBlendModeHandler added in v1.0.3

func BackgroundBlendModeHandler(value string) bool

func BackgroundClipHandler added in v1.0.3

func BackgroundClipHandler(value string) bool

func BackgroundHandler added in v1.0.3

func BackgroundHandler(value string) bool

func BackgroundOriginHandler added in v1.0.3

func BackgroundOriginHandler(value string) bool

func BackgroundPositionHandler added in v1.0.3

func BackgroundPositionHandler(value string) bool

func BackgroundRepeatHandler added in v1.0.3

func BackgroundRepeatHandler(value string) bool

func BackgroundSizeHandler added in v1.0.3

func BackgroundSizeHandler(value string) bool

func BaseHandler added in v1.0.3

func BaseHandler(value string) bool

func BorderCollapseHandler added in v1.0.3

func BorderCollapseHandler(value string) bool

func BorderHandler added in v1.0.3

func BorderHandler(value string) bool

func BorderImageHandler added in v1.0.3

func BorderImageHandler(value string) bool

func BorderImageOutsetHandler added in v1.0.3

func BorderImageOutsetHandler(value string) bool

func BorderImageRepeatHandler added in v1.0.3

func BorderImageRepeatHandler(value string) bool

func BorderImageSliceHandler added in v1.0.3

func BorderImageSliceHandler(value string) bool

func BorderImageWidthHandler added in v1.0.3

func BorderImageWidthHandler(value string) bool

func BorderRadiusHandler added in v1.0.3

func BorderRadiusHandler(value string) bool

func BorderSideHandler added in v1.0.3

func BorderSideHandler(value string) bool

func BorderSideRadiusHandler added in v1.0.3

func BorderSideRadiusHandler(value string) bool

func BorderSideStyleHandler added in v1.0.3

func BorderSideStyleHandler(value string) bool

func BorderSideWidthHandler added in v1.0.3

func BorderSideWidthHandler(value string) bool

func BorderSpacingHandler added in v1.0.3

func BorderSpacingHandler(value string) bool

func BorderStyleHandler added in v1.0.3

func BorderStyleHandler(value string) bool

func BorderWidthHandler added in v1.0.3

func BorderWidthHandler(value string) bool

func BoxDecorationBreakHandler added in v1.0.3

func BoxDecorationBreakHandler(value string) bool

func BoxShadowHandler added in v1.0.3

func BoxShadowHandler(value string) bool

func BoxSizingHandler added in v1.0.3

func BoxSizingHandler(value string) bool

func BreakBeforeAfterHandler added in v1.0.3

func BreakBeforeAfterHandler(value string) bool

func BreakInsideHandler added in v1.0.3

func BreakInsideHandler(value string) bool

func CaptionSideHandler added in v1.0.3

func CaptionSideHandler(value string) bool

func CaretColorHandler added in v1.0.3

func CaretColorHandler(value string) bool

func ClearHandler added in v1.0.3

func ClearHandler(value string) bool

func ClipHandler added in v1.0.3

func ClipHandler(value string) bool

func ColorHandler added in v1.0.3

func ColorHandler(value string) bool

func ColumnCountHandler added in v1.0.3

func ColumnCountHandler(value string) bool

func ColumnFillHandler added in v1.0.3

func ColumnFillHandler(value string) bool

func ColumnGapHandler added in v1.0.3

func ColumnGapHandler(value string) bool

func ColumnRuleHandler added in v1.0.3

func ColumnRuleHandler(value string) bool

func ColumnRuleWidthHandler added in v1.0.3

func ColumnRuleWidthHandler(value string) bool

func ColumnSpanHandler added in v1.0.3

func ColumnSpanHandler(value string) bool

func ColumnWidthHandler added in v1.0.3

func ColumnWidthHandler(value string) bool

func ColumnsHandler added in v1.0.3

func ColumnsHandler(value string) bool

func CursorHandler added in v1.0.3

func CursorHandler(value string) bool

func DirectionHandler added in v1.0.3

func DirectionHandler(value string) bool

func DisplayHandler added in v1.0.3

func DisplayHandler(value string) bool

func EmptyCellsHandler added in v1.0.3

func EmptyCellsHandler(value string) bool

func FilterHandler added in v1.0.3

func FilterHandler(value string) bool

func FlexBasisHandler added in v1.0.3

func FlexBasisHandler(value string) bool

func FlexDirectionHandler added in v1.0.3

func FlexDirectionHandler(value string) bool

func FlexFlowHandler added in v1.0.3

func FlexFlowHandler(value string) bool

func FlexGrowHandler added in v1.0.3

func FlexGrowHandler(value string) bool

func FlexHandler added in v1.0.3

func FlexHandler(value string) bool

func FlexWrapHandler added in v1.0.3

func FlexWrapHandler(value string) bool

func FloatHandler added in v1.0.3

func FloatHandler(value string) bool

func FontFamilyHandler added in v1.0.3

func FontFamilyHandler(value string) bool

func FontHandler added in v1.0.3

func FontHandler(value string) bool

func FontKerningHandler added in v1.0.3

func FontKerningHandler(value string) bool

func FontLanguageOverrideHandler added in v1.0.3

func FontLanguageOverrideHandler(value string) bool

func FontSizeAdjustHandler added in v1.0.3

func FontSizeAdjustHandler(value string) bool

func FontSizeHandler added in v1.0.3

func FontSizeHandler(value string) bool

func FontStretchHandler added in v1.0.3

func FontStretchHandler(value string) bool

func FontStyleHandler added in v1.0.3

func FontStyleHandler(value string) bool

func FontSynthesisHandler added in v1.0.3

func FontSynthesisHandler(value string) bool

func FontVariantCapsHandler added in v1.0.3

func FontVariantCapsHandler(value string) bool

func FontVariantHandler added in v1.0.3

func FontVariantHandler(value string) bool

func FontVariantPositionHandler added in v1.0.3

func FontVariantPositionHandler(value string) bool

func FontWeightHandler added in v1.0.3

func FontWeightHandler(value string) bool

func GridAreaHandler added in v1.0.3

func GridAreaHandler(value string) bool

func GridAutoColumnsHandler added in v1.0.3

func GridAutoColumnsHandler(value string) bool

func GridAutoFlowHandler added in v1.0.3

func GridAutoFlowHandler(value string) bool

func GridAxisStartEndHandler added in v1.0.3

func GridAxisStartEndHandler(value string) bool

func GridColumnGapHandler added in v1.0.3

func GridColumnGapHandler(value string) bool

func GridColumnHandler added in v1.0.3

func GridColumnHandler(value string) bool

func GridGapHandler added in v1.0.3

func GridGapHandler(value string) bool

func GridHandler added in v1.0.3

func GridHandler(value string) bool

func GridRowHandler added in v1.0.3

func GridRowHandler(value string) bool

func GridTemplateAreasHandler added in v1.0.3

func GridTemplateAreasHandler(value string) bool

func GridTemplateColumnsHandler added in v1.0.3

func GridTemplateColumnsHandler(value string) bool

func GridTemplateHandler added in v1.0.3

func GridTemplateHandler(value string) bool

func GridTemplateRowsHandler added in v1.0.3

func GridTemplateRowsHandler(value string) bool

func HangingPunctuationHandler added in v1.0.3

func HangingPunctuationHandler(value string) bool

func HeightHandler added in v1.0.3

func HeightHandler(value string) bool

func HyphensHandler added in v1.0.3

func HyphensHandler(value string) bool

func ImageHandler added in v1.0.3

func ImageHandler(value string) bool

func ImageRenderingHandler added in v1.0.3

func ImageRenderingHandler(value string) bool

func IsolationHandler added in v1.0.3

func IsolationHandler(value string) bool

func JustifyContentHandler added in v1.0.3

func JustifyContentHandler(value string) bool

func LengthHandler added in v1.0.3

func LengthHandler(value string) bool

func LetterSpacingHandler added in v1.0.3

func LetterSpacingHandler(value string) bool

func LineBreakHandler added in v1.0.3

func LineBreakHandler(value string) bool

func LineHeightHandler added in v1.0.3

func LineHeightHandler(value string) bool

func ListStyleHandler added in v1.0.3

func ListStyleHandler(value string) bool

func ListStylePositionHandler added in v1.0.3

func ListStylePositionHandler(value string) bool

func ListStyleTypeHandler added in v1.0.3

func ListStyleTypeHandler(value string) bool

func MarginHandler added in v1.0.3

func MarginHandler(value string) bool

func MarginSideHandler added in v1.0.3

func MarginSideHandler(value string) bool

func MaxHeightWidthHandler added in v1.0.3

func MaxHeightWidthHandler(value string) bool

func MinHeightWidthHandler added in v1.0.3

func MinHeightWidthHandler(value string) bool

func MixBlendModeHandler added in v1.0.3

func MixBlendModeHandler(value string) bool

func ObjectFitHandler added in v1.0.3

func ObjectFitHandler(value string) bool

func ObjectPositionHandler added in v1.0.3

func ObjectPositionHandler(value string) bool

func OpacityHandler added in v1.0.3

func OpacityHandler(value string) bool

func OrderHandler added in v1.0.3

func OrderHandler(value string) bool

func OrphansHandler added in v1.0.3

func OrphansHandler(value string) bool

func OutlineHandler added in v1.0.3

func OutlineHandler(value string) bool

func OutlineOffsetHandler added in v1.0.3

func OutlineOffsetHandler(value string) bool

func OutlineStyleHandler added in v1.0.3

func OutlineStyleHandler(value string) bool

func OutlineWidthHandler added in v1.0.3

func OutlineWidthHandler(value string) bool

func OverflowHandler added in v1.0.3

func OverflowHandler(value string) bool

func OverflowWrapHandler added in v1.0.3

func OverflowWrapHandler(value string) bool

func OverflowXYHandler added in v1.0.3

func OverflowXYHandler(value string) bool

func PaddingHandler added in v1.0.3

func PaddingHandler(value string) bool

func PaddingSideHandler added in v1.0.3

func PaddingSideHandler(value string) bool

func PageBreakBeforeAfterHandler added in v1.0.3

func PageBreakBeforeAfterHandler(value string) bool

func PageBreakInsideHandler added in v1.0.3

func PageBreakInsideHandler(value string) bool

func PerspectiveHandler added in v1.0.3

func PerspectiveHandler(value string) bool

func PerspectiveOriginHandler added in v1.0.3

func PerspectiveOriginHandler(value string) bool

func PointerEventsHandler added in v1.0.3

func PointerEventsHandler(value string) bool

func PositionHandler added in v1.0.3

func PositionHandler(value string) bool

func QuotesHandler added in v1.0.3

func QuotesHandler(value string) bool

func ResizeHandler added in v1.0.3

func ResizeHandler(value string) bool

func ScrollBehaviorHandler added in v1.0.3

func ScrollBehaviorHandler(value string) bool

func SideHandler added in v1.0.3

func SideHandler(value string) bool

func TabSizeHandler added in v1.0.3

func TabSizeHandler(value string) bool

func TableLayoutHandler added in v1.0.3

func TableLayoutHandler(value string) bool

func TextAlignHandler added in v1.0.3

func TextAlignHandler(value string) bool

func TextAlignLastHandler added in v1.0.3

func TextAlignLastHandler(value string) bool

func TextCombineUprightHandler added in v1.0.3

func TextCombineUprightHandler(value string) bool

func TextDecorationHandler added in v1.0.3

func TextDecorationHandler(value string) bool

func TextDecorationLineHandler added in v1.0.3

func TextDecorationLineHandler(value string) bool

func TextDecorationStyleHandler added in v1.0.3

func TextDecorationStyleHandler(value string) bool

func TextIndentHandler added in v1.0.3

func TextIndentHandler(value string) bool

func TextJustifyHandler added in v1.0.3

func TextJustifyHandler(value string) bool

func TextOrientationHandler added in v1.0.3

func TextOrientationHandler(value string) bool

func TextOverflowHandler added in v1.0.3

func TextOverflowHandler(value string) bool

func TextShadowHandler added in v1.0.3

func TextShadowHandler(value string) bool

func TextTransformHandler added in v1.0.3

func TextTransformHandler(value string) bool

func TimingFunctionHandler added in v1.0.3

func TimingFunctionHandler(value string) bool

func TransformHandler added in v1.0.3

func TransformHandler(value string) bool

func TransformOriginHandler added in v1.0.3

func TransformOriginHandler(value string) bool

func TransformStyleHandler added in v1.0.3

func TransformStyleHandler(value string) bool

func TransitionDelayHandler added in v1.0.3

func TransitionDelayHandler(value string) bool

func TransitionDurationHandler added in v1.0.3

func TransitionDurationHandler(value string) bool

func TransitionHandler added in v1.0.3

func TransitionHandler(value string) bool

func TransitionPropertyHandler added in v1.0.3

func TransitionPropertyHandler(value string) bool

func UnicodeBidiHandler added in v1.0.3

func UnicodeBidiHandler(value string) bool

func UserSelectHandler added in v1.0.3

func UserSelectHandler(value string) bool

func VerticalAlignHandler added in v1.0.3

func VerticalAlignHandler(value string) bool

func VisiblityHandler added in v1.0.3

func VisiblityHandler(value string) bool

func WhiteSpaceHandler added in v1.0.3

func WhiteSpaceHandler(value string) bool

func WidthHandler added in v1.0.3

func WidthHandler(value string) bool

func WordBreakHandler added in v1.0.3

func WordBreakHandler(value string) bool

func WordSpacingHandler added in v1.0.3

func WordSpacingHandler(value string) bool

func WordWrapHandler added in v1.0.3

func WordWrapHandler(value string) bool

func WritingModeHandler added in v1.0.3

func WritingModeHandler(value string) bool

func ZIndexHandler added in v1.0.3

func ZIndexHandler(value string) bool
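
Every handler listed above has the same signature: it receives a value as a string and reports whether that value is acceptable. The sketch below is a hypothetical validator of that shape, written with the standard library only. The name unitIntervalHandler is invented for illustration and does not reproduce any built-in handler. The comment in main assumes that the builder returned by AllowStyles exposes a MatchingHandler method, which this listing does not show.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// unitIntervalHandler is a hypothetical validator (not part of bluemonday)
// with the same signature as the handlers above. It accepts any value that
// strconv.ParseFloat can read as a number between 0 and 1 inclusive.
func unitIntervalHandler(value string) bool {
	f, err := strconv.ParseFloat(strings.TrimSpace(value), 64)
	if err != nil {
		return false
	}
	// NaN fails both comparisons, so it is rejected as well.
	return f >= 0 && f <= 1
}

func main() {
	// Assumption: with bluemonday imported, a function of this shape could be
	// attached to a property through the style policy builder, for example
	//   p.AllowStyles("opacity").MatchingHandler(unitIntervalHandler).Globally()
	for _, v := range []string{"0.5", "1", "2", "url(javascript:alert(1))"} {
		fmt.Printf("%q -> %v\n", v, unitIntervalHandler(v))
	}
}
```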

Types

type Policy

type Policy struct {
	// contains filtered or unexported fields
}

Policy encapsulates the whitelist of HTML elements and attributes that will be applied to the sanitised HTML.

You should use bluemonday.NewPolicy() to create a blank policy as the unexported fields contain maps that need to be initialized.

func NewPolicy

func NewPolicy() *Policy

NewPolicy returns a blank policy with nothing whitelisted or permitted. This is the recommended way to start building a policy and you should now use AllowAttrs() and/or AllowElements() to construct the whitelist of HTML elements and attributes.

Example
package main

import (
	"fmt"

	"github.com/microcosm-cc/bluemonday"
)

func main() {
	// NewPolicy is a blank policy and we need to explicitly whitelist anything
	// that we wish to allow through
	p := bluemonday.NewPolicy()

	// We ensure any URLs are parseable and have rel="nofollow" where applicable
	p.AllowStandardURLs()

	// AllowStandardURLs already ensures that the href will be valid, and so we
	// can skip the .Matching()
	p.AllowAttrs("href").OnElements("a")

	// We allow paragraphs too
	p.AllowElements("p")

	html := p.Sanitize(
		`<p><a onblur="alert(secret)" href="http://www.google.com">Google</a></p>`,
	)

	fmt.Println(html)

}
Output:

<p><a href="http://www.google.com" rel="nofollow">Google</a></p>

func StrictPolicy

func StrictPolicy() *Policy

StrictPolicy returns an empty policy, which will effectively strip all HTML elements and their attributes from a document.

Example
package main

import (
	"fmt"

	"github.com/microcosm-cc/bluemonday"
)

func main() {
	// StrictPolicy is equivalent to NewPolicy and as nothing else is declared
	// we are stripping all elements (and their attributes)
	p := bluemonday.StrictPolicy()

	html := p.Sanitize(
		`Goodbye <a onblur="alert(secret)" href="http://en.wikipedia.org/wiki/Goodbye_Cruel_World_(Pink_Floyd_song)">Cruel</a> World`,
	)

	fmt.Println(html)

}
Output:

Goodbye Cruel World

func StripTagsPolicy

func StripTagsPolicy() *Policy

StripTagsPolicy is DEPRECATED. Use StrictPolicy instead.

func UGCPolicy

func UGCPolicy() *Policy

UGCPolicy returns a policy aimed at user generated content that is a result of HTML WYSIWYG tools and Markdown conversions.

This is expected to be a fairly rich document where as much markup as possible should be retained. Markdown permits raw HTML so we are basically providing a policy to sanitise HTML5 documents safely but with the least intrusion on the formatting expectations of the user.

Example
package main

import (
	"fmt"

	"github.com/microcosm-cc/bluemonday"
)

func main() {
	// UGCPolicy is a convenience policy for user generated content.
	p := bluemonday.UGCPolicy()

	html := p.Sanitize(
		`<a onblur="alert(secret)" href="http://www.google.com">Google</a>`,
	)

	fmt.Println(html)

}
Output:

<a href="http://www.google.com" rel="nofollow">Google</a>

func (*Policy) AddSpaceWhenStrippingTag

func (p *Policy) AddSpaceWhenStrippingTag(allow bool) *Policy

AddSpaceWhenStrippingTag states whether to add a single space " " when removing tags that are not whitelisted by the policy.

This is useful if you expect to strip tags in dense markup and may lose the value of whitespace.

For example: "<p>Hello</p><p>World</p>" would be sanitized to "HelloWorld" with the default value of false, but you may wish to sanitize this to " Hello World " by setting AddSpaceWhenStrippingTag to true as this would retain the intent of the text.

func (*Policy) AddTargetBlankToFullyQualifiedLinks

func (p *Policy) AddTargetBlankToFullyQualifiedLinks(require bool) *Policy

AddTargetBlankToFullyQualifiedLinks will result in all a, area and link tags that point to a non-local destination (i.e. starts with a protocol and has a host) having a target="_blank" added to them if one does not already exist

Note: This requires p.RequireParseableURLs(true) and will enable it.
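
The "non-local destination" criterion (a URL that starts with a protocol and has a host) can be illustrated with net/url alone. This is a minimal sketch of the stated rule, not bluemonday's own implementation. The helper name isFullyQualified is hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
)

// isFullyQualified reports whether raw parses as a URL that has both a
// scheme (protocol) and a host, the criterion described above.
func isFullyQualified(raw string) bool {
	u, err := url.Parse(raw)
	if err != nil {
		return false
	}
	return u.Scheme != "" && u.Host != ""
}

func main() {
	for _, raw := range []string{
		"https://example.com/page",   // scheme and host
		"/local/page",                // neither scheme nor host
		"mailto:someone@example.com", // scheme but no host
	} {
		fmt.Printf("%-28s fully qualified: %v\n", raw, isFullyQualified(raw))
	}
}
```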

func (*Policy) AllowAttrs

func (p *Policy) AllowAttrs(attrNames ...string) *attrPolicyBuilder

AllowAttrs takes a range of HTML attribute names and returns an attribute policy builder that allows you to specify the pattern and scope of the whitelisted attribute.

The attribute policy is only added to the core policy when either Globally() or OnElements(...) are called.

Example
package main

import (
	"github.com/microcosm-cc/bluemonday"
)

func main() {
	p := bluemonday.NewPolicy()

	// Allow the 'title' attribute on every HTML element that has been
	// whitelisted
	p.AllowAttrs("title").Matching(bluemonday.Paragraph).Globally()

	// Allow the 'abbr' attribute on only the 'td' and 'th' elements.
	p.AllowAttrs("abbr").Matching(bluemonday.Paragraph).OnElements("td", "th")

	// Allow the 'colspan' and 'rowspan' attributes, matching a positive integer
	// pattern, on only the 'td' and 'th' elements.
	p.AllowAttrs("colspan", "rowspan").Matching(
		bluemonday.Integer,
	).OnElements("td", "th")
}
Output:

func (*Policy) AllowDataAttributes

func (p *Policy) AllowDataAttributes()

AllowDataAttributes whitelists all data attributes. We can't specify the name of each attribute exactly as they are customized.

NOTE: These values are not sanitized and applications that evaluate or process them without checking and verification of the input may be at risk if this option is enabled. This is a 'caveat emptor' option and the person enabling this option needs to fully understand the potential impact with regards to whatever application will be consuming the sanitized HTML afterwards, i.e. if you know you put a link in a data attribute and use that to automatically load some new window then you're giving the author of a HTML fragment the means to open a malicious destination automatically. Use with care!

func (*Policy) AllowDataURIImages

func (p *Policy) AllowDataURIImages()

AllowDataURIImages permits the use of inline images defined in RFC2397 http://tools.ietf.org/html/rfc2397 http://en.wikipedia.org/wiki/Data_URI_scheme

Images must have a mimetype matching:

image/gif
image/jpeg
image/png
image/webp

NOTE: There is a potential security risk to allowing data URIs and you should only permit them on content you already trust. http://palizine.plynt.com/issues/2010Oct/bypass-xss-filters/ https://capec.mitre.org/data/definitions/244.html
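
To make the mimetype restriction concrete, the sketch below checks the media type of a data URI against the four types listed above, using the standard library only. It is a simplified pre-check written for illustration and makes no claim about how bluemonday validates data URIs internally. The helper name isAllowedImageDataURI is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// allowedImageTypes mirrors the four mimetypes listed above.
var allowedImageTypes = map[string]bool{
	"image/gif":  true,
	"image/jpeg": true,
	"image/png":  true,
	"image/webp": true,
}

// isAllowedImageDataURI is a hypothetical helper. It extracts the media type
// that sits between "data:" and the first ';' or ',' and looks it up in the
// allow list. It does not inspect or decode the payload.
func isAllowedImageDataURI(uri string) bool {
	if !strings.HasPrefix(uri, "data:") {
		return false
	}
	rest := strings.TrimPrefix(uri, "data:")
	comma := strings.Index(rest, ",")
	if comma < 0 {
		return false // a data URI needs a comma before the payload
	}
	meta := rest[:comma] // e.g. "image/png;base64"
	mediaType := strings.SplitN(meta, ";", 2)[0]
	return allowedImageTypes[strings.ToLower(mediaType)]
}

func main() {
	for _, uri := range []string{
		"data:image/png;base64,iVBORw0KGgo=",
		"data:text/html;base64,PHNjcmlwdD4=",
		"https://example.com/logo.png",
	} {
		fmt.Printf("%-40s allowed: %v\n", uri, isAllowedImageDataURI(uri))
	}
}
```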

func (*Policy) AllowElements

func (p *Policy) AllowElements(names ...string) *Policy

AllowElements will append HTML elements to the whitelist without applying an attribute policy to those elements (the elements are permitted sans-attributes)

Example
package main

import (
	"github.com/microcosm-cc/bluemonday"
)

func main() {
	p := bluemonday.NewPolicy()

	// Allow styling elements without attributes
	p.AllowElements("br", "div", "hr", "p", "span")
}
Output:

func (*Policy) AllowElementsContent

func (p *Policy) AllowElementsContent(names ...string) *Policy

AllowElementsContent marks the HTML elements whose content should be retained after removing the tag.

func (*Policy) AllowElementsMatching added in v1.0.3

func (p *Policy) AllowElementsMatching(regex *regexp.Regexp) *Policy

func (*Policy) AllowImages

func (p *Policy) AllowImages()

AllowImages enables the img element and some popular attributes. It will also ensure that URL values are parseable. This helper does not enable data URI images, for that you should also use the AllowDataURIImages() helper.

func (*Policy) AllowLists

func (p *Policy) AllowLists()

AllowLists will enable ordered and unordered lists, as well as definition lists.

func (*Policy) AllowNoAttrs

func (p *Policy) AllowNoAttrs() *attrPolicyBuilder

AllowNoAttrs says that attributes on the element are optional.

The attribute policy is only added to the core policy when OnElements(...) is called.

func (*Policy) AllowRelativeURLs

func (p *Policy) AllowRelativeURLs(require bool) *Policy

AllowRelativeURLs enables RequireParseableURLs and then permits URLs that are parseable, have no scheme information, and for which url.IsAbs() returns false. This permits local URLs.
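
The test described above can be reproduced with net/url directly. The sketch below is illustrative only. The helper name isRelative is hypothetical, and the policy's own check may apply further conditions.

```go
package main

import (
	"fmt"
	"net/url"
)

// isRelative reports whether raw is parseable by net/url and carries no
// scheme, which is exactly when url.IsAbs() returns false.
func isRelative(raw string) bool {
	u, err := url.Parse(raw)
	if err != nil {
		return false // unparseable URLs are never treated as relative
	}
	return !u.IsAbs()
}

func main() {
	for _, raw := range []string{
		"/images/logo.png",
		"../about",
		"https://example.com/",
		"mailto:someone@example.com",
	} {
		fmt.Printf("%-28s relative: %v\n", raw, isRelative(raw))
	}
}
```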

func (*Policy) AllowStandardAttributes

func (p *Policy) AllowStandardAttributes()

AllowStandardAttributes will enable "id", "title" and the language specific attributes "dir" and "lang" on all elements that are whitelisted

func (*Policy) AllowStandardURLs

func (p *Policy) AllowStandardURLs()

AllowStandardURLs is a convenience function that will enable rel="nofollow" on "a", "area" and "link" (if you have allowed those elements) and will ensure that the URL values are parseable and either relative or belong to the "mailto", "http", or "https" schemes

func (*Policy) AllowStyles added in v1.0.3

func (p *Policy) AllowStyles(propertyNames ...string) *stylePolicyBuilder

AllowStyles takes a range of CSS property names and returns a style policy builder that allows you to specify the pattern and scope of the whitelisted property.

The style policy is only added to the core policy when either Globally() or OnElements(...) are called.

Example
package main

import (
	"fmt"
	"regexp"

	"github.com/microcosm-cc/bluemonday"
)

func main() {
	p := bluemonday.NewPolicy()

	// Allow only 'span', 'p' and 'strong' elements
	p.AllowElements("span", "p", "strong")

	// Only allow 'style' attributes on 'span' and 'p' elements
	p.AllowAttrs("style").OnElements("span", "p")

	// Allow the 'text-decoration' property to be set to 'underline', 'line-through' or 'none'
	// on 'span' elements only
	p.AllowStyles("text-decoration").MatchingEnum("underline", "line-through", "none").OnElements("span")

	// Allow the 'color' property with valid RGB(A) hex values only
	// on every HTML element that has been whitelisted
	p.AllowStyles("color").Matching(regexp.MustCompile("(?i)^#([0-9a-f]{3,4}|[0-9a-f]{6}|[0-9a-f]{8})$")).Globally()

	// Default handler
	p.AllowStyles("background-origin").Globally()

	// The span has an invalid 'color' which will be stripped along with other disallowed properties
	html := p.Sanitize(
		`<p style="color:#f00;">
	<span style="text-decoration: underline; background-image: url(javascript:alert('XSS')); color: #f00ba; background-origin: invalidValue">
		Red underlined <strong style="text-decoration:none;">text</strong>
	</span>
</p>`,
	)

	fmt.Println(html)

}
Output:

<p style="color: #f00">
	<span style="text-decoration: underline">
		Red underlined <strong>text</strong>
	</span>
</p>
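
The reason the span's color was stripped in the example above can be checked in isolation: the pattern accepts 3, 4, 6 or 8 hex digits, and "#f00ba" has five. The sketch below exercises the same regular expression with the standard library only. The helper name isHexColor is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// hexColor is the pattern passed to Matching() in the example above.
var hexColor = regexp.MustCompile("(?i)^#([0-9a-f]{3,4}|[0-9a-f]{6}|[0-9a-f]{8})$")

// isHexColor reports whether v is an RGB(A) hex colour accepted by hexColor.
func isHexColor(v string) bool {
	return hexColor.MatchString(v)
}

func main() {
	for _, v := range []string{"#f00", "#f00ba", "#FF00AA", "#ff00aa80", "red"} {
		fmt.Printf("%-10s matches: %v\n", v, isHexColor(v))
	}
}
```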

func (*Policy) AllowStyling

func (p *Policy) AllowStyling()

AllowStyling presently enables the class attribute globally.

Note: When bluemonday ships a CSS parser and we can safely sanitise that, this will also allow sanitized styling of elements via the style attribute.

func (*Policy) AllowTables

func (p *Policy) AllowTables()

AllowTables will enable a rich set of elements and attributes to describe HTML tables

func (*Policy) AllowURLSchemeWithCustomPolicy

func (p *Policy) AllowURLSchemeWithCustomPolicy(
	scheme string,
	urlPolicy func(url *url.URL) (allowUrl bool),
) *Policy

AllowURLSchemeWithCustomPolicy will append URL schemes with a custom URL policy to the whitelist. Only URLs with a matching scheme and for which urlPolicy(url) returns true will be allowed.
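
A urlPolicy is an ordinary function over *url.URL, so it can be written and tested without the sanitizer. The sketch below is a hypothetical policy that permits a single host and its subdomains. The names allowExampleHost and checkRaw and the host example.com are assumptions for illustration.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// allowExampleHost is a hypothetical urlPolicy: it permits only example.com
// and its subdomains.
func allowExampleHost(u *url.URL) bool {
	host := strings.ToLower(u.Hostname())
	return host == "example.com" || strings.HasSuffix(host, ".example.com")
}

// checkRaw is a helper for this sketch only. It parses raw, requires the
// "https" scheme, and then applies allowExampleHost.
func checkRaw(raw string) bool {
	u, err := url.Parse(raw)
	if err != nil {
		return false
	}
	return u.Scheme == "https" && allowExampleHost(u)
}

func main() {
	// With bluemonday imported, the policy function would be registered as:
	//   p.AllowURLSchemeWithCustomPolicy("https", allowExampleHost)
	for _, raw := range []string{
		"https://example.com/a",
		"https://cdn.example.com/a",
		"https://notexample.com/a",
		"http://example.com/a",
	} {
		fmt.Printf("%-28s allowed: %v\n", raw, checkRaw(raw))
	}
}
```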

func (*Policy) AllowURLSchemes

func (p *Policy) AllowURLSchemes(schemes ...string) *Policy

AllowURLSchemes will append URL schemes to the whitelist. Example: p.AllowURLSchemes("mailto", "http", "https")

func (*Policy) RequireCrossOriginAnonymous added in v1.0.6

func (p *Policy) RequireCrossOriginAnonymous(require bool) *Policy

RequireCrossOriginAnonymous will result in all audio, img, link, script, and video tags having a crossorigin="anonymous" added to them if one does not already exist

func (*Policy) RequireNoFollowOnFullyQualifiedLinks

func (p *Policy) RequireNoFollowOnFullyQualifiedLinks(require bool) *Policy

RequireNoFollowOnFullyQualifiedLinks will result in all a, area, and link tags that point to a non-local destination (i.e. starts with a protocol and has a host) having a rel="nofollow" added to them if one does not already exist

Note: This requires p.RequireParseableURLs(true) and will enable it.

func (*Policy) RequireNoFollowOnLinks

func (p *Policy) RequireNoFollowOnLinks(require bool) *Policy

RequireNoFollowOnLinks will result in all a, area, and link tags having a rel="nofollow" added to them if one does not already exist

Note: This requires p.RequireParseableURLs(true) and will enable it.

func (*Policy) RequireNoReferrerOnFullyQualifiedLinks

func (p *Policy) RequireNoReferrerOnFullyQualifiedLinks(require bool) *Policy

RequireNoReferrerOnFullyQualifiedLinks will result in all a, area, and link tags that point to a non-local destination (i.e. starts with a protocol and has a host) having a rel="noreferrer" added to them if one does not already exist

Note: This requires p.RequireParseableURLs(true) and will enable it.

func (*Policy) RequireNoReferrerOnLinks

func (p *Policy) RequireNoReferrerOnLinks(require bool) *Policy

RequireNoReferrerOnLinks will result in all a, area, and link tags having a rel="noreferrer" added to them if one does not already exist

Note: This requires p.RequireParseableURLs(true) and will enable it.

func (*Policy) RequireParseableURLs

func (p *Policy) RequireParseableURLs(require bool) *Policy

RequireParseableURLs will result in all URLs requiring that they be parseable by "net/url" url.Parse(). This applies to:

- a.href
- area.href
- blockquote.cite
- img.src
- link.href
- script.src

func (*Policy) Sanitize

func (p *Policy) Sanitize(s string) string

Sanitize takes a string that contains a HTML fragment or document and applies the given policy whitelist.

It returns a HTML string that has been sanitized by the policy or an empty string if an error has occurred (most likely as a consequence of extremely malformed input)

Example
package main

import (
	"fmt"

	"github.com/microcosm-cc/bluemonday"
)

func main() {
	// UGCPolicy is a convenience policy for user generated content.
	p := bluemonday.UGCPolicy()

	// string in, string out
	html := p.Sanitize(`<a onblur="alert(secret)" href="http://www.google.com">Google</a>`)

	fmt.Println(html)

}
Output:

<a href="http://www.google.com" rel="nofollow">Google</a>

func (*Policy) SanitizeBytes

func (p *Policy) SanitizeBytes(b []byte) []byte

SanitizeBytes takes a []byte that contains a HTML fragment or document and applies the given policy whitelist.

It returns a []byte containing the HTML that has been sanitized by the policy or an empty []byte if an error has occurred (most likely as a consequence of extremely malformed input)

Example
package main

import (
	"fmt"

	"github.com/microcosm-cc/bluemonday"
)

func main() {
	// UGCPolicy is a convenience policy for user generated content.
	p := bluemonday.UGCPolicy()

	// []byte in, []byte out
	b := []byte(`<a onblur="alert(secret)" href="http://www.google.com">Google</a>`)
	b = p.SanitizeBytes(b)

	fmt.Println(string(b))

}
Output:

<a href="http://www.google.com" rel="nofollow">Google</a>

func (*Policy) SanitizeReader

func (p *Policy) SanitizeReader(r io.Reader) *bytes.Buffer

SanitizeReader takes an io.Reader that contains a HTML fragment or document and applies the given policy whitelist.

It returns a bytes.Buffer containing the HTML that has been sanitized by the policy. Errors during sanitization will merely return an empty result.

Example
package main

import (
	"fmt"
	"strings"

	"github.com/microcosm-cc/bluemonday"
)

func main() {
	// UGCPolicy is a convenience policy for user generated content.
	p := bluemonday.UGCPolicy()

	// io.Reader in, bytes.Buffer out
	r := strings.NewReader(`<a onblur="alert(secret)" href="http://www.google.com">Google</a>`)
	buf := p.SanitizeReader(r)

	fmt.Println(buf.String())

}
Output:

<a href="http://www.google.com" rel="nofollow">Google</a>

func (*Policy) SkipElementsContent

func (p *Policy) SkipElementsContent(names ...string) *Policy

SkipElementsContent adds the HTML elements whose tags need to be removed along with their content.

type Query added in v1.0.6

type Query struct {
	Key   string
	Value string
}

Query represents a query

Directories

Path Synopsis
cmd
sanitise_html_email
Package main demonstrates a HTML email cleaner.
sanitise_ugc
Package main demonstrates a simple user generated content sanitizer.
