Directories
| Path | Synopsis |
|---|---|
| constants | Package constants provides configuration constants and selectors for the defuddle content extraction system. |
| debug | Package debug provides debugging functionality for the defuddle content extraction system. |
| elements | Package elements provides enhanced element processing functionality. This module handles code block processing, including syntax highlighting, language detection, and code formatting. |
| markdown | Package markdown provides HTML to Markdown conversion functionality. |
| metadata | Package metadata provides functionality for extracting and processing document metadata. |
| removals | Package removals provides content-pattern-based removal for the defuddle extraction pipeline. |
| scoring | Package scoring provides content scoring functionality for the defuddle content extraction system. |
| standardize | Package standardize provides content standardization functionality for the defuddle content extraction system. |
| text | Package text provides text analysis utilities for content extraction. |
| urlutil | Package urlutil provides URL resolution and sanitization for extracted content. |