Has parsed start/end dates and times (when they could be parsed unambiguously).
Has normalized activity names for easy filtering.
Marks whether a reservation is likely required for an activity, and includes a list of links.
Error messages are included.
Has fields with raw text from the website for validation or as a fallback if parsing fails.
Limitations
Special schedules may overlap with a subset of the regular schedule's dates, so the start/end dates can only be used to rule out a scheduled activity for a specific date, not to confirm that it applies.
Exceptions and notifications are included as raw HTML since they're freeform.
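The date-range limitation above can be sketched as follows. This is only an illustration: the function name and the use of optional `date` values for the parsed start/end dates are assumptions, not the actual field names or types in the data.

```python
from datetime import date
from typing import Optional


def definitely_not_on(day: date, start: Optional[date], end: Optional[date]) -> bool:
    """Return True only when the parsed date range rules out `day`.

    A False result does NOT mean the activity happens on `day`: a special
    schedule (e.g., a holiday schedule) may still override the regular one.
    A missing bound (None) is treated as unknown and never rules a day out.
    """
    if start is not None and day < start:
        return True
    if end is not None and day > end:
        return True
    return False


# Hypothetical regular schedule running from January through March.
regular_start, regular_end = date(2024, 1, 1), date(2024, 3, 31)

print(definitely_not_on(date(2024, 4, 15), regular_start, regular_end))  # True
print(definitely_not_on(date(2024, 2, 19), regular_start, regular_end))  # False
```

In the second call the date falls inside the range, so the schedule might apply, but the raw exceptions and any overlapping special schedules should still be checked.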
Raw data
Features / Limitations
Only basic facility information and schedule information are scraped. This helps keep the scraper reliable and ensures the schema can be kept stable long-term.
Facility addresses are geocoded using Geocodio (which has better results than Pelias/geocode.earth and Nominatim).
Schedule changes and facility notifications are scraped on a best-effort basis without additional parsing since these fields are inherently free-form. This helps keep the scraper reliable and reduces the likelihood of accidentally missing important information.
Scraped fields have minimal processing. This helps keep the scraper reliable and reduces the likelihood of accidentally missing important information.
Optional fields are available which contain best-effort parsing and normalization of scraped fields (to assist with usage), including:
Normalized schedule group name.
Normalized schedule name (facility and date range stripped).
Raw schedule date range (if stripped from the normalized schedule name).
Parsed schedule date range.
Normalized schedule activity name.
Activity time range, and weekday as an integer.
Explicit reservation requirement in activity names as a boolean (typically, this is used as an exception to the default based on whether the schedule group has reservation links).
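As a rough sketch of how the explicit reservation flag might be combined with the schedule group default described above: the function and parameter names here are hypothetical, and the real data may represent an absent flag differently than `None`.

```python
from typing import Optional


def reservation_likely_required(
    explicit: Optional[bool],
    group_has_reservation_links: bool,
) -> bool:
    """Combine the per-activity flag with the schedule group default.

    `explicit` is the optional boolean parsed from the activity name
    (None when the name says nothing about reservations). When it is
    present, it overrides the default; otherwise the default is whether
    the schedule group has any reservation links.
    """
    if explicit is not None:
        return explicit
    return group_has_reservation_links


# Group has reservation links, but the activity name says none is needed.
print(reservation_likely_required(False, True))   # False
# No explicit flag, so fall back to the group default.
print(reservation_likely_required(None, True))    # True
```

Since the flag comes from best-effort parsing, the result is only a hint. The raw activity name remains the authoritative source.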
Overlapping schedules (e.g., holiday schedules) are not merged. These schedules are not consistently formatted as they are manually named and created, so although I attempt to parse time ranges, I don't use them to merge schedules. This helps keep the scraper reliable and reduces the likelihood of accidentally missing important information.
Any potential parsing problems are included in an array of error messages for each facility.
A protobuf schema is used for maintainability, but it may be changed in backwards-incompatible ways if needed.