Documentation
Overview ¶
Package kobodict implements reading, writing, and other utilities for Kobo dictionaries (v2).
If Writer or Reader is used, a marisa implementation must be provided, either by importing github.com/pgaskin/dictutil/kobodict/marisa or by supplying a custom one.
Index ¶
- Constants
- Variables
- func NormalizeWordReference(w string, variant bool) string
- func Pack(w *Writer, dir string) error
- func Unpack(r *Reader, dir string) error
- func WordPrefix(word string) string
- type Crypter
- type Decrypter
- type Encrypter
- type MarisaReader
- type MarisaWriter
- type Reader
- type ReaderDicthtml
- type ReaderFile
- type Writer
Constants ¶
const CryptMethodAES string = "aes"
CryptMethodAES represents AES-128-ECB encryption with PKCS#7 padding.
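To make the scheme named above concrete, here is a minimal standard-library sketch of AES-128-ECB with PKCS#7 padding. The helpers pkcs7Pad, pkcs7Unpad, ecbEncrypt, and ecbDecrypt are hypothetical names chosen for illustration; they are not part of kobodict, and this is not necessarily how the package implements the method.

```go
package main

import (
	"bytes"
	"crypto/aes"
	"errors"
	"fmt"
)

// pkcs7Pad (hypothetical helper) appends n bytes of value n so that the
// result length is a multiple of blockSize. A full block is added when the
// input is already aligned.
func pkcs7Pad(b []byte, blockSize int) []byte {
	n := blockSize - len(b)%blockSize
	out := append([]byte{}, b...)
	return append(out, bytes.Repeat([]byte{byte(n)}, n)...)
}

// pkcs7Unpad (hypothetical helper) validates and strips PKCS#7 padding.
func pkcs7Unpad(b []byte, blockSize int) ([]byte, error) {
	if len(b) == 0 || len(b)%blockSize != 0 {
		return nil, errors.New("pkcs7: invalid length")
	}
	n := int(b[len(b)-1])
	if n == 0 || n > blockSize {
		return nil, errors.New("pkcs7: invalid padding")
	}
	for _, c := range b[len(b)-n:] {
		if int(c) != n {
			return nil, errors.New("pkcs7: invalid padding")
		}
	}
	return b[:len(b)-n], nil
}

// ecbEncrypt (hypothetical helper) pads the plaintext and encrypts each
// 16-byte block independently, which is what ECB mode means.
func ecbEncrypt(key, plaintext []byte) ([]byte, error) {
	if len(key) != 16 {
		return nil, errors.New("aes-128 requires a 16-byte key")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	bs := block.BlockSize()
	src := pkcs7Pad(plaintext, bs)
	dst := make([]byte, len(src))
	for i := 0; i < len(src); i += bs {
		block.Encrypt(dst[i:i+bs], src[i:i+bs])
	}
	return dst, nil
}

// ecbDecrypt (hypothetical helper) reverses ecbEncrypt.
func ecbDecrypt(key, ciphertext []byte) ([]byte, error) {
	if len(key) != 16 {
		return nil, errors.New("aes-128 requires a 16-byte key")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	bs := block.BlockSize()
	if len(ciphertext) == 0 || len(ciphertext)%bs != 0 {
		return nil, errors.New("ciphertext is not a multiple of the block size")
	}
	dst := make([]byte, len(ciphertext))
	for i := 0; i < len(ciphertext); i += bs {
		block.Decrypt(dst[i:i+bs], ciphertext[i:i+bs])
	}
	return pkcs7Unpad(dst, bs)
}

func main() {
	key := []byte("0123456789abcdef") // example key, 16 bytes
	ct, err := ecbEncrypt(key, []byte("<html>example</html>"))
	if err != nil {
		panic(err)
	}
	pt, err := ecbDecrypt(key, ct)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(ct), string(pt))
}
```

Because ECB encrypts each block independently, identical plaintext blocks produce identical ciphertext blocks. The sketch shows the format only and makes no claim about its security.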
Variables ¶
var Marisa interface {
	MarisaReader
	MarisaWriter
}
Marisa is used by Reader and Writer for reading/writing Marisa tries. It is automatically set on supported platforms if github.com/pgaskin/dictutil/kobodict/marisa is imported, but can be overridden manually.
Functions ¶
func NormalizeWordReference ¶
NormalizeWordReference normalizes a word for use in a dicthtml headword (<a name="...") or variant (<variant name="..."). It matches the way Kobo finds words in a file.
The logic is reversed from DictionaryParser::htmlForWord in libnickel.
Note: Headwords are prefix-matched against the query, the uppercased query, the lowercased query, and the lowercased query with the first letter uppercased. Variants are only prefix-matched against the lowercased query.
Note: The matching is only done in the file matching the prefix for the query.
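The matching rules in the notes above can be sketched as follows. This is an illustration only, not the logic from libnickel or from this package: headwordCandidates, matchesHeadword, and matchesVariant are hypothetical names, and the sketch assumes that "prefix-matched" means the stored name starts with the query form.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
	"unicode/utf8"
)

// headwordCandidates (hypothetical) returns the four query forms listed in
// the note: as typed, uppercased, lowercased, and lowercased with the first
// letter uppercased.
func headwordCandidates(query string) []string {
	lower := strings.ToLower(query)
	title := lower
	if r, size := utf8.DecodeRuneInString(lower); size > 0 && r != utf8.RuneError {
		title = string(unicode.ToUpper(r)) + lower[size:]
	}
	return []string{query, strings.ToUpper(query), lower, title}
}

// matchesHeadword (hypothetical) reports whether a headword name starts with
// any candidate form of the query. The direction of the prefix match is an
// assumption.
func matchesHeadword(name, query string) bool {
	for _, c := range headwordCandidates(query) {
		if strings.HasPrefix(name, c) {
			return true
		}
	}
	return false
}

// matchesVariant (hypothetical) only tries the lowercased query, as the note
// describes for variants.
func matchesVariant(name, query string) bool {
	return strings.HasPrefix(name, strings.ToLower(query))
}

func main() {
	fmt.Println(headwordCandidates("hEllo"))
	fmt.Println(matchesHeadword("Hello", "hEllo"), matchesVariant("Hello", "hEllo"))
}
```

One practical consequence, under these assumptions, is that a variant stored with any uppercase letters would never match, which is a reason to normalize names before writing them.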
func Pack ¶
Pack is a helper function to pack the contents of a folder unpacked using Unpack into a Writer. It is assumed that the writer has not been used. The provided file will be overwritten if it exists and is a regular file, or created if it doesn't exist. Pack will not close the writer.
func Unpack ¶
Unpack is a helper function to unpack the contents of a Reader to a folder on-disk. The provided dir must be non-existent. Unpack will not close the reader.
func WordPrefix ¶
WordPrefix gets the prefix of a word for sharding dicthtml files.
This is not to be used with Kanji, as those are handled by a separate function for Japanese dictionaries.
WordPrefix is a simplification of the logic reversed from DictionaryParser::htmlForWord (see wordPrefix), with better performance and cleaner code. It should produce exactly the same results.
Types ¶
type Decrypter ¶
type Decrypter interface {
	// Decrypt decrypts the dicthtml bytes. It will only be called if the
	// dicthtml is not otherwise readable. An error should be returned if the
	// decryption itself encounters an error; the decryptor should not try to
	// judge if the resulting bytes are valid.
	Decrypt([]byte) ([]byte, error)
}
Decrypter decrypts dicthtml files.
type Encrypter ¶
type Encrypter interface {
	// Encrypt encrypts the provided bytes.
	Encrypt([]byte) ([]byte, error)
}
Encrypter encrypts dicthtml files.
type MarisaReader ¶
MarisaReader represents a simplified abstraction for reading Marisa tries.
type MarisaWriter ¶
MarisaWriter represents a simplified abstraction for writing Marisa tries.
type Reader ¶
type Reader struct {
	Word     []string
	Dicthtml []*ReaderDicthtml
	File     []*ReaderFile
	// contains filtered or unexported fields
}
Reader provides access to the contents of a dictzip file.
func NewReader ¶
NewReader returns a new dictzip reader which reads from r, with the given file size.
func (*Reader) SetDecrypter ¶
SetDecrypter sets the Decrypter used to decrypt encrypted dicthtml files.
type ReaderDicthtml ¶
ReaderDicthtml represents a dicthtml file from a Reader.
func (*ReaderDicthtml) Open ¶
func (f *ReaderDicthtml) Open() (io.ReadCloser, error)
Open returns an io.ReadCloser which reads the decoded dicthtml file. Multiple files can be read at once.
type ReaderFile ¶
type ReaderFile struct {
	Name string
	// contains filtered or unexported fields
}
ReaderFile represents a raw file from a Reader (e.g. images).
func (*ReaderFile) Open ¶
func (f *ReaderFile) Open() (io.ReadCloser, error)
Open returns an io.ReadCloser which reads the contents of the file. Multiple files can be read at once.
type Writer ¶
type Writer struct {
// contains filtered or unexported fields
}
Writer creates dictzips. It does not do any validation; it only does what it is told. It is up to the user to ensure the input is valid.
func (*Writer) AddWord ¶
AddWord normalizes and adds a word to the index. If the word has already been added, it does nothing.
func (*Writer) Close ¶
Close writes the marisa index and the zip footer. The error should not be ignored. It does not close the underlying writer.
func (*Writer) CreateDicthtml ¶
CreateDicthtml adds a dicthtml file for the specified prefix and returns a writer which is valid until the next file is created.
func (*Writer) CreateFile ¶
CreateFile adds a raw file with the specified name. Note that Kobo only supports GIF and JPEG files starting with the "GIF" and "JFIF" magic, and the treatment of other files is undefined. In addition, subdirectories are not supported. The behaviour is undefined if a dicthtml file is added this way.
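Since the Writer does no validation, a caller may want to pre-check raw files before adding them. The sketch below is one way to do that with the standard library. looksLikeSupportedImage and validRawFileName are hypothetical helpers, not part of kobodict, and the exact check Kobo performs is not documented here. The sketch assumes the standard JFIF layout, where the "JFIF" identifier sits at byte offset 6, after the SOI and APP0 markers and the segment length.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
)

// looksLikeSupportedImage (hypothetical) reports whether b begins like a GIF
// or a JPEG with a JFIF APP0 segment. How Kobo actually tests files is an
// assumption.
func looksLikeSupportedImage(b []byte) bool {
	if bytes.HasPrefix(b, []byte("GIF")) {
		return true
	}
	// Standard JFIF header: FF D8 (SOI), FF E0 (APP0), 2-byte length, "JFIF".
	return len(b) >= 10 &&
		bytes.HasPrefix(b, []byte{0xFF, 0xD8, 0xFF, 0xE0}) &&
		bytes.Equal(b[6:10], []byte("JFIF"))
}

// validRawFileName (hypothetical) rejects empty names and names containing a
// path separator, since subdirectories are not supported.
func validRawFileName(name string) bool {
	return name != "" && !strings.ContainsAny(name, `/\`)
}

func main() {
	gif := []byte("GIF89a")
	jfif := []byte{0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10, 'J', 'F', 'I', 'F', 0x00}
	png := []byte{0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A}

	fmt.Println(looksLikeSupportedImage(gif), looksLikeSupportedImage(jfif), looksLikeSupportedImage(png))
	fmt.Println(validRawFileName("cover.gif"), validRawFileName("img/cover.gif"))
}
```

Under these assumptions, JPEG files that carry only an Exif segment (no JFIF APP0) would be rejected by the sketch and may need re-encoding before being added.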
func (*Writer) Exists ¶
Exists checks if a file already exists in the dictzip with the specified name.
func (*Writer) SetEncrypter ¶
SetEncrypter sets the Encrypter used to encrypt dicthtml files. It only applies to dicthtml files added after the encrypter is set.