postgres

package
v0.0.0-...-86cb477 Latest
Warning

This package is not in the latest version of its module.

Published: Feb 25, 2024 License: BSD-3-Clause Imports: 47 Imported by: 0

Documentation

Overview

Package postgres provides functionality for reading and writing to the postgres database.

Constants

This section is empty.

Variables

var (

	// SearchLatencyDistribution aggregates search request latency by search
	// query type.
	SearchLatencyDistribution = &view.View{
		Name:        "go-discovery/search/latency",
		Measure:     searchLatency,
		Aggregation: ochttp.DefaultLatencyDistribution,
		Description: "Search latency, by result source query type.",
		TagKeys:     []tag.Key{keySearchSource},
	}
	// SearchResponseCount counts search responses by search query type.
	SearchResponseCount = &view.View{
		Name:        "go-discovery/search/count",
		Measure:     searchLatency,
		Aggregation: view.Count(),
		Description: "Search count, by result source query type.",
		TagKeys:     []tag.Key{keySearchSource},
	}
)

Functions

func GeneratePathTokens

func GeneratePathTokens(packagePath string) []string

GeneratePathTokens returns the subPaths and path token parts that will be indexed for search, which includes (1) the packagePath (2) all sub-paths of the packagePath (3) all parts for a path element that is delimited by a dash and (4) all parts of a path element that is delimited by a dot, except for the last element.
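The four token kinds listed above can be illustrated with a small, self-contained sketch. The function pathTokens below is a hypothetical approximation written for this example, not the package's implementation. It assumes that "sub-paths" means every contiguous run of path elements and that the dot rule applies to every element; the real function may be more selective.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// pathTokens is a hypothetical approximation of the tokenization described
// for GeneratePathTokens. It is a sketch under stated assumptions only.
func pathTokens(packagePath string) []string {
	packagePath = strings.Trim(packagePath, "/")
	if packagePath == "" {
		return nil
	}
	set := map[string]bool{}
	elems := strings.Split(packagePath, "/")
	for i, elem := range elems {
		// (1) and (2): every contiguous run of elements starting at i.
		// When i == 0 the longest run is the full package path.
		for j := i + 1; j <= len(elems); j++ {
			set[strings.Join(elems[i:j], "/")] = true
		}
		// (3): all parts of a dash-delimited element.
		if parts := strings.Split(elem, "-"); len(parts) > 1 {
			for _, p := range parts {
				if p != "" {
					set[p] = true
				}
			}
		}
		// (4): all parts of a dot-delimited element, except the last.
		if parts := strings.Split(elem, "."); len(parts) > 1 {
			for _, p := range parts[:len(parts)-1] {
				if p != "" {
					set[p] = true
				}
			}
		}
	}
	tokens := make([]string, 0, len(set))
	for t := range set {
		tokens = append(tokens, t)
	}
	sort.Strings(tokens)
	return tokens
}

func main() {
	for _, t := range pathTokens("github.com/my-org/pkg") {
		fmt.Println(t)
	}
}
```

For the sample path, the sketch yields the full path, its sub-paths, "my" and "org" from the dashed element, and "github" (but not "com") from the dotted element.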

func GetFromSearchDocuments

func GetFromSearchDocuments(ctx context.Context, t *testing.T, db *DB, packagePath string) (modulePath, version string, found bool)

GetFromSearchDocuments retrieves the module path and version for the given package path from the search_documents table. If the path is not in the table, the third return value is false.

func GetPathID

func GetPathID(ctx context.Context, ddb *database.DB, path string) (id int, err error)

func GetSymbolHistoryForBuildContext

func GetSymbolHistoryForBuildContext(ctx context.Context, ddb *database.DB, pathID int, modulePath string,
	bc internal.BuildContext) (_ map[string]string, err error)

GetSymbolHistoryForBuildContext returns a map from each symbol name to the first version in which that symbol was added to the API for the specified build context. Children names are tracked as separate entries in the map.

func GetSymbolHistoryFromTable

func GetSymbolHistoryFromTable(ctx context.Context, ddb *database.DB,
	packagePath, modulePath string) (_ *internal.SymbolHistory, err error)

GetSymbolHistoryFromTable returns a SymbolHistory, which is a representation of the first version when a symbol is added to an API. It reads data from the symbol_history table.

func GetSymbolHistoryWithPackageSymbols

func GetSymbolHistoryWithPackageSymbols(ctx context.Context, ddb *database.DB,
	packagePath, modulePath string) (_ *internal.SymbolHistory, err error)

GetSymbolHistoryWithPackageSymbols fetches symbol history data from the package_symbols and documentation_symbols tables and computes the history using symbol.IntroducedHistory.

GetSymbolHistoryWithPackageSymbols is exported for use in tests.

func InsertSampleDirectoryTree

func InsertSampleDirectoryTree(ctx context.Context, t *testing.T, testDB *DB)

InsertSampleDirectoryTree inserts a set of packages for testing GetUnit and frontend.FetchDirectoryDetails.

func MustInsertModule

func MustInsertModule(ctx context.Context, t *testing.T, db *DB, m *internal.Module)

MustInsertModule inserts m into db, calling t.Fatal on error. It also updates the latest-version information for m.

func MustInsertModuleGoMod

func MustInsertModuleGoMod(ctx context.Context, t *testing.T, db *DB, m *internal.Module, goMod string)

func MustInsertModuleNotLatest

func MustInsertModuleNotLatest(ctx context.Context, t *testing.T, db *DB, m *internal.Module)

func ResetTestDB

func ResetTestDB(db *DB, t *testing.T)

ResetTestDB truncates all data from the given test DB. It should be called after every test that mutates the database.

func RunDBTests

func RunDBTests(dbName string, m *testing.M, testDB **DB)

RunDBTests is a wrapper that runs the given testing suite in a test database named dbName. The given *DB reference will be set to the instantiated test database.

func RunDBTestsInParallel

func RunDBTestsInParallel(dbBaseName string, numDBs int, m *testing.M, acquirep *func(*testing.T) (*DB, func()))

RunDBTestsInParallel sets up numDBs databases, then runs the tests. Before it runs them, it sets acquirep to a function that tests should use to acquire a database. The second return value of the function should be called in a defer statement to release the database. For example:

func Test(t *testing.T) {
    db, release := acquire(t)
    defer release()
}
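One plausible way to build such an acquire function is a buffered channel used as a pool. The sketch below is illustrative only: fakeDB and newAcquire are hypothetical names, the *testing.T parameter is omitted to keep the example runnable as a plain program, and nothing here is confirmed to match how the package manages its databases.

```go
package main

import "fmt"

// fakeDB stands in for a test database handle (hypothetical type).
type fakeDB struct{ name string }

// newAcquire returns a function that hands out one database at a time.
// The second return value of that function puts the database back.
func newAcquire(dbs []*fakeDB) func() (*fakeDB, func()) {
	pool := make(chan *fakeDB, len(dbs))
	for _, db := range dbs {
		pool <- db
	}
	return func() (*fakeDB, func()) {
		db := <-pool // blocks until some database is free
		return db, func() { pool <- db }
	}
}

func main() {
	acquire := newAcquire([]*fakeDB{{"test-0"}, {"test-1"}})

	db, release := acquire()
	defer release()
	fmt.Println("using", db.name)
}
```

With this shape, a test that forgets to call release eventually starves the other tests, which is why the release call belongs in a defer statement.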

func SearchDocumentSections

func SearchDocumentSections(synopsis, readmeFilename, readme string) (b, c, d string)

SearchDocumentSections computes the B, C and D sections of a Postgres search document from a package synopsis and a README. By "B section", "C section" and "D section" we mean the portion of the tsvector with weight "B", "C" and "D", respectively.

The B section consists of the synopsis. The C section consists of the first sentence of the README. The D section consists of the remainder of the README. All sections are split into words and processed for replacements. Each section is limited to maxSectionWords words, and in addition the D section is limited to an initial fraction of the README, determined by maxReadmeFraction.

func UpsertSearchDocument

func UpsertSearchDocument(ctx context.Context, ddb *database.DB, args UpsertSearchDocumentArgs) (err error)

UpsertSearchDocument inserts a row in search_documents for the given package. The given module should have already been validated via a call to validateModule.

Types

type DB

type DB struct {
	// contains filtered or unexported fields
}

func New

func New(db *database.DB) *DB

New returns a new postgres DB.

func NewBypassingLicenseCheck

func NewBypassingLicenseCheck(db *database.DB) *DB

NewBypassingLicenseCheck returns a new postgres DB that bypasses license checks. That means all data will be inserted and returned for non-redistributable modules, packages and directories.

func SetupTestDB

func SetupTestDB(dbName string) (_ *DB, err error)

SetupTestDB creates a test database named dbName if it does not already exist, and migrates it to the latest schema from the migrations directory.

func (*DB) CleanAllModuleVersions

func (db *DB) CleanAllModuleVersions(ctx context.Context, modulePath, reason string) (err error)

CleanAllModuleVersions deletes all versions of the given module path from the DB and marks them as cleaned in module_version_states.

func (*DB) CleanModuleVersions

func (db *DB) CleanModuleVersions(ctx context.Context, mvs []internal.Modver, reason string) (err error)

CleanModuleVersions deletes each module version from the DB and marks it as cleaned in module_version_states.

func (*DB) Close

func (db *DB) Close() error

Close closes a DB.

func (*DB) DeleteModule

func (db *DB) DeleteModule(ctx context.Context, modulePath, resolvedVersion string) (err error)

DeleteModule deletes a module version from the database.

func (*DB) DeletePseudoversionsExcept

func (db *DB) DeletePseudoversionsExcept(ctx context.Context, modulePath, resolvedVersion string) (err error)

DeletePseudoversionsExcept deletes all pseudoversions for the module except the provided resolvedVersion.

func (*DB) GetExcludedPrefixes

func (db *DB) GetExcludedPrefixes(ctx context.Context) ([]string, error)

GetExcludedPrefixes reads all the excluded prefixes from the database.

func (*DB) GetImportedBy

func (db *DB) GetImportedBy(ctx context.Context, pkgPath, modulePath string, limit int) (paths []string, err error)

GetImportedBy fetches and returns the packages that import the package with path pkgPath. The returned error may be checked with derrors.IsInvalidArgument to determine if it resulted from an invalid package path or version.

Instead of supporting pagination, this query runs with a limit.

func (*DB) GetImportedByCount

func (db *DB) GetImportedByCount(ctx context.Context, pkgPath, modulePath string) (_ int, err error)

GetImportedByCount returns the number of packages that import pkgPath.

func (*DB) GetLatestInfo

func (db *DB) GetLatestInfo(ctx context.Context, unitPath, modulePath string, latestUnitMeta *internal.UnitMeta) (latest internal.LatestInfo, err error)

GetLatestInfo returns the latest information about the unit in the module. See internal.LatestInfo for documentation about the returned values. If latestUnitMeta is non-nil, it is the result of GetUnitMeta(unitPath, internal.UnknownModulePath, internal.LatestVersion). That can save a redundant call to GetUnitMeta here.

func (*DB) GetLatestMajorPathForV1Path

func (db *DB) GetLatestMajorPathForV1Path(ctx context.Context, v1path string) (_ string, _ int, err error)

GetLatestMajorPathForV1Path reports the latest unit path in the series for the given v1path. It also returns the major version for that path.

func (*DB) GetLatestModuleVersions

func (db *DB) GetLatestModuleVersions(ctx context.Context, modulePath string) (_ *internal.LatestModuleVersions, err error)

GetLatestModuleVersions returns the row of the latest_module_versions table for modulePath. If the module path is not found, it returns nil, nil.

func (*DB) GetModuleInfo

func (db *DB) GetModuleInfo(ctx context.Context, modulePath, resolvedVersion string) (_ *internal.ModuleInfo, err error)

GetModuleInfo fetches a module version from the database with the primary key (module_path, version).

func (*DB) GetModuleReadme

func (db *DB) GetModuleReadme(ctx context.Context, modulePath, resolvedVersion string) (_ *internal.Readme, err error)

GetModuleReadme returns the README corresponding to the modulePath and version.

func (*DB) GetModuleVersionState

func (db *DB) GetModuleVersionState(ctx context.Context, modulePath, resolvedVersion string) (_ *internal.ModuleVersionState, err error)

GetModuleVersionState returns the current module version state for modulePath and version.

func (*DB) GetModuleVersionsToClean

func (db *DB) GetModuleVersionsToClean(ctx context.Context, daysOld, limit int) (modvers []internal.Modver, err error)

GetModuleVersionsToClean returns module versions that can be removed from the database. Only module versions that were updated more than daysOld days ago will be considered. At most limit module versions will be returned.

func (*DB) GetNestedModules

func (db *DB) GetNestedModules(ctx context.Context, modulePath string) (_ []*internal.ModuleInfo, err error)

GetNestedModules returns the latest major version of all nested modules given a modulePath path prefix with or without major version.

func (*DB) GetNextModulesToFetch

func (db *DB) GetNextModulesToFetch(ctx context.Context, limit int) (_ []*internal.ModuleVersionState, err error)

GetNextModulesToFetch returns the next batch of modules that need to be processed. We prioritize modules based on (1) whether it has status zero (never processed), (2) whether it is the latest version, (3) if it is an alternative module, and (4) the number of packages it has. We want to leave time-consuming modules until the end and process them at a slower rate to reduce database load and timeouts. We also want to leave alternative modules towards the end, since these will incur unnecessary deletes otherwise.
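The prioritization above can be pictured as a multi-key sort. The sketch below is an assumption-laden illustration: the candidate struct and its fields are hypothetical, and treating the four criteria as strict tie-breakers in the listed order is a guess about how they combine. The actual ordering is done by the database query.

```go
package main

import (
	"fmt"
	"sort"
)

// candidate is a hypothetical in-memory stand-in for a row considered
// for fetching.
type candidate struct {
	path          string
	status        int // 0 means never processed
	isLatest      bool
	isAlternative bool
	numPackages   int
}

// higherPriority reports whether a should be fetched before b, applying
// the four criteria as successive tie-breakers (an assumption).
func higherPriority(a, b candidate) bool {
	if (a.status == 0) != (b.status == 0) {
		return a.status == 0 // never-processed modules first
	}
	if a.isLatest != b.isLatest {
		return a.isLatest // latest versions first
	}
	if a.isAlternative != b.isAlternative {
		return !a.isAlternative // alternative modules towards the end
	}
	return a.numPackages < b.numPackages // time-consuming modules last
}

func prioritize(cs []candidate) {
	sort.SliceStable(cs, func(i, j int) bool { return higherPriority(cs[i], cs[j]) })
}

func main() {
	cs := []candidate{
		{"example.com/big", 0, true, false, 500},
		{"example.com/alt", 0, true, true, 1},
		{"example.com/old", 520, true, false, 1},
		{"example.com/small", 0, true, false, 2},
		{"example.com/notlatest", 0, false, false, 1},
	}
	prioritize(cs)
	for _, c := range cs {
		fmt.Println(c.path)
	}
}
```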

func (*DB) GetPackageVersionState

func (db *DB) GetPackageVersionState(ctx context.Context, pkgPath, modulePath, resolvedVersion string) (_ *internal.PackageVersionState, err error)

GetPackageVersionState returns the current package version state for pkgPath, modulePath and version.

func (*DB) GetPackageVersionStatesForModule

func (db *DB) GetPackageVersionStatesForModule(ctx context.Context, modulePath, resolvedVersion string) (_ []*internal.PackageVersionState, err error)

GetPackageVersionStatesForModule returns the current package version states for modulePath and version.

func (*DB) GetPackagesForSearchDocumentUpsert

func (db *DB) GetPackagesForSearchDocumentUpsert(ctx context.Context, before time.Time, limit int) (argsList []UpsertSearchDocumentArgs, err error)

GetPackagesForSearchDocumentUpsert fetches search information for packages in search_documents whose update time is before the given time.

func (*DB) GetRecentFailedVersions

func (db *DB) GetRecentFailedVersions(ctx context.Context, limit int) (_ []*internal.ModuleVersionState, err error)

GetRecentFailedVersions returns versions that have most recently failed.

func (*DB) GetRecentVersions

func (db *DB) GetRecentVersions(ctx context.Context, limit int) (_ []*internal.ModuleVersionState, err error)

GetRecentVersions returns recent versions that have been processed.

func (*DB) GetStdlibPathsWithSuffix

func (db *DB) GetStdlibPathsWithSuffix(ctx context.Context, suffix string) (paths []string, err error)

GetStdlibPathsWithSuffix returns information about all paths in the latest version of the standard library whose last component is suffix. A path that exactly matches suffix is not included; the path must end with "/" + suffix.

We are only interested in actual standard library packages: not commands, which we happen to include in the stdlib module, and not directories (paths that do not contain a package).
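The suffix rule is easy to misread, so here is a minimal sketch of just the string condition. The helper endsWithComponent is hypothetical, and the sketch ignores the additional filtering to actual standard library packages described above.

```go
package main

import (
	"fmt"
	"strings"
)

// endsWithComponent reports whether path ends with "/" + suffix.
// A path equal to suffix does not qualify.
func endsWithComponent(path, suffix string) bool {
	return strings.HasSuffix(path, "/"+suffix)
}

func main() {
	for _, p := range []string{"encoding/json", "json", "encoding/xjson"} {
		fmt.Println(p, endsWithComponent(p, "json"))
	}
}
```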

func (*DB) GetSymbolHistory

func (db *DB) GetSymbolHistory(ctx context.Context, packagePath, modulePath string,
) (_ *internal.SymbolHistory, err error)

GetSymbolHistory returns a SymbolHistory, which is a representation of the first version when a symbol is added to an API.

func (*DB) GetUnit

func (db *DB) GetUnit(ctx context.Context, um *internal.UnitMeta, fields internal.FieldSet, bc internal.BuildContext) (_ *internal.Unit, err error)

GetUnit returns a unit from the database, along with all of the data associated with that unit. If bc is not nil, get only the Documentation that matches it (or nil if none do).

func (*DB) GetUnitMeta

func (db *DB) GetUnitMeta(ctx context.Context, fullPath, requestedModulePath, requestedVersion string) (_ *internal.UnitMeta, err error)

GetUnitMeta returns information about the "best" entity (module, path or directory) with the given path. The module and version arguments provide additional constraints. If the module is unknown, pass internal.UnknownModulePath; if the version is unknown, pass internal.LatestVersion.

The rules for picking the best are as follows.

1. If the version is known but the module path is not, choose the longest module path at that version that contains fullPath.

2. Otherwise, find the latest "good" version (in the modules table) that contains fullPath.

   a. First, follow the algorithm of the go command: prefer longer module paths, and find the latest unretracted version, using semver but preferring release to pre-release.
   b. If no modules have latest-version information, find the latest by sorting the versions we do have: again first by module path length, then by version.
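The "prefer longer module paths" part of these rules can be sketched without a database. In the example below, candidateModulePaths and longestModule are hypothetical helpers, and the known set stands in for whatever module paths exist at the version in question. Version selection (retractions, semver ordering) is deliberately left out.

```go
package main

import (
	"fmt"
	"strings"
)

// candidateModulePaths lists every prefix of fullPath that ends on a
// path-element boundary, longest first.
func candidateModulePaths(fullPath string) []string {
	var out []string
	for p := fullPath; p != ""; {
		out = append(out, p)
		i := strings.LastIndex(p, "/")
		if i < 0 {
			break
		}
		p = p[:i]
	}
	return out
}

// longestModule returns the longest known module path containing fullPath.
func longestModule(fullPath string, known map[string]bool) (string, bool) {
	for _, c := range candidateModulePaths(fullPath) {
		if known[c] {
			return c, true
		}
	}
	return "", false
}

func main() {
	known := map[string]bool{
		"github.com/a":   true,
		"github.com/a/b": true,
	}
	fmt.Println(longestModule("github.com/a/b/c", known))
}
```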

func (*DB) GetUserInfo

func (db *DB) GetUserInfo(ctx context.Context, user string) (_ *UserInfo, err error)

GetUserInfo returns information about a database user.

func (*DB) GetVersionMap

func (db *DB) GetVersionMap(ctx context.Context, modulePath, requestedVersion string) (_ *internal.VersionMap, err error)

GetVersionMap fetches a version_map entry corresponding to the given modulePath and requestedVersion.

func (*DB) GetVersionMaps

func (db *DB) GetVersionMaps(ctx context.Context, paths []string, requestedVersion string) (_ []*internal.VersionMap, err error)

GetVersionMaps returns all of the version maps for the provided paths and requested version if they are present.

func (*DB) GetVersionStats

func (db *DB) GetVersionStats(ctx context.Context) (_ *VersionStats, err error)

GetVersionStats queries the module_version_states table for aggregate information about the current state of module versions, grouping them by their current status code.

func (*DB) GetVersionsForPath

func (db *DB) GetVersionsForPath(ctx context.Context, path string) (_ []*internal.ModuleInfo, err error)

GetVersionsForPath returns a list of tagged versions sorted in descending semver order if any exist. If none, it returns the 10 most recent from a list of pseudo-versions sorted in descending semver order.

func (*DB) HasGoMod

func (db *DB) HasGoMod(ctx context.Context, modulePath, version string) (has bool, err error)

HasGoMod reports whether a given module version has a go.mod file. It returns a NotFound error if it can't find any information.

func (*DB) InsertExcludedPrefix

func (db *DB) InsertExcludedPrefix(ctx context.Context, prefix, user, reason string) (err error)

InsertExcludedPrefix inserts prefix into the excluded_prefixes table.

For real-time administration (e.g. DOS prevention), use the dbadmin tool to exclude or unexclude a prefix. If the exclusion is permanent (e.g. a user request), also add the prefix and reason to the excluded.txt file.

func (*DB) InsertIndexVersions

func (db *DB) InsertIndexVersions(ctx context.Context, versions []*internal.IndexVersion) (err error)

InsertIndexVersions inserts new versions into the module_version_states table with a status of zero.

func (*DB) InsertModule

func (db *DB) InsertModule(ctx context.Context, m *internal.Module, lmv *internal.LatestModuleVersions) (isLatest bool, err error)

InsertModule inserts a version into the database using db.saveVersion, along with a search document corresponding to each of its packages. It returns whether the version inserted was the latest for the given module path.

func (*DB) InsertNewModuleVersionFromFrontendFetch

func (db *DB) InsertNewModuleVersionFromFrontendFetch(ctx context.Context, modulePath, resolvedVersion string) (err error)

InsertNewModuleVersionFromFrontendFetch inserts a new module version into the module_version_states table with a status of zero that was requested from frontend fetch.

func (*DB) IsExcluded

func (db *DB) IsExcluded(ctx context.Context, path string) (_ bool, err error)

IsExcluded reports whether the path matches the excluded list. A path matches an entry on the excluded list if it equals the entry, or if the entry is a component-wise prefix of the path. So path "bad/ness" matches entries "bad" and "bad/", but path "badness" matches neither of those.
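The matching rule for a single entry can be sketched in a few lines. The helper matchesExcluded is hypothetical and operates on strings only; the method itself consults the excluded_prefixes data in the database, and its exact handling of edge cases is not shown on this page.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesExcluded reports whether path equals entry or lies beneath it,
// comparing whole path components rather than raw string prefixes.
func matchesExcluded(path, entry string) bool {
	if path == entry {
		return true
	}
	if !strings.HasSuffix(entry, "/") {
		entry += "/"
	}
	return strings.HasPrefix(path, entry)
}

func main() {
	fmt.Println(matchesExcluded("bad/ness", "bad"))  // matches
	fmt.Println(matchesExcluded("bad/ness", "bad/")) // matches
	fmt.Println(matchesExcluded("badness", "bad"))   // does not match
}
```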

func (*DB) LatestIndexTimestamp

func (db *DB) LatestIndexTimestamp(ctx context.Context) (_ time.Time, err error)

LatestIndexTimestamp returns the last timestamp successfully inserted into the module_version_states table.

func (*DB) NumUnprocessedModules

func (db *DB) NumUnprocessedModules(ctx context.Context) (total, new int, err error)

NumUnprocessedModules returns the number of modules that need to be processed.

func (*DB) ReconcileSearch

func (db *DB) ReconcileSearch(ctx context.Context, modulePath, version string, status int) (err error)

ReconcileSearch reconciles the search data for modulePath. If the module is alternative or has no good versions, it removes search data. Otherwise, if the latest good version doesn't match the version in search_documents, and the module path is not a prefix of one already in search_documents, it inserts the latest good version into search_documents and imports_unique. The version and status arguments should come from the module currently being fetched. They are used to determine if the module is alternative.
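The decision described above reduces to a small truth table. The sketch below models it with hypothetical types (state, action) and boolean inputs that the real method would derive from the database. It captures only the branching, not the actual updates to search_documents and imports_unique.

```go
package main

import "fmt"

type action int

const (
	noChange action = iota
	removeSearchData
	insertLatestGood
)

// state holds hypothetical, precomputed inputs to the decision.
type state struct {
	isAlternative           bool
	latestGoodVersion       string // "" if the module has no good versions
	searchDocVersion        string // version in search_documents, "" if absent
	isPrefixOfIndexedModule bool   // module path is a prefix of one already indexed
}

// reconcile picks the action implied by the documented rules.
func reconcile(s state) action {
	if s.isAlternative || s.latestGoodVersion == "" {
		return removeSearchData
	}
	if s.latestGoodVersion != s.searchDocVersion && !s.isPrefixOfIndexedModule {
		return insertLatestGood
	}
	return noChange
}

func main() {
	s := state{latestGoodVersion: "v1.2.0", searchDocVersion: "v1.1.0"}
	fmt.Println(reconcile(s) == insertLatestGood)
}
```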

func (*DB) Search

func (db *DB) Search(ctx context.Context, q string, opts SearchOptions) (_ []*SearchResult, err error)

Search executes two search requests concurrently:

  • a sequential scan of packages in descending order of popularity.
  • all packages ("deep" search) using an inverted index to filter to search terms.

The sequential scan takes significantly less time when searching for very common terms (e.g. "errors", "cloud", or "kubernetes"), due to its ability to exit early once the requested page of search results is provably complete.

Because 0 <= ts_rank() <= 1, we know that the score of any unscanned package is at most ln(e+N), where N is the imported_by_count of the package we are currently considering. Therefore, if the lowest-scoring result of the popular search is greater than ln(e+N), we know that we haven't missed any results and can return the search results immediately, cancelling the other search.

On the other hand, if the popular search is slow, it is likely that the search term is infrequent, and deep search will be fast due to our inverted gin index on search tokens.

The gap in this optimization is search terms that are very frequent, but rarely relevant: "int" or "package", for example. In these cases we'll pay the penalty of a deep search that scans nearly every package.
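The early-exit argument for the popular scan can be made concrete with an in-memory simulation. Everything in this sketch is an assumption made for illustration: the pkg struct, the rank field standing in for ts_rank(), and the scoring formula rank * ln(e+N), which is inferred from the bound quoted above rather than taken from the implementation.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// pkg is a hypothetical search candidate. rank plays the role of
// ts_rank() and is assumed to lie in [0, 1]; 0 means "no match".
type pkg struct {
	path       string
	importedBy int
	rank       float64
}

// score is the assumed scoring formula: rank scaled by popularity.
func score(p pkg) float64 {
	return p.rank * math.Log(math.E+float64(p.importedBy))
}

// popularScan walks pkgs, which must be sorted by descending importedBy,
// and stops as soon as the top `limit` results are provably complete.
func popularScan(pkgs []pkg, limit int) (results []pkg, scanned int) {
	for _, p := range pkgs {
		// No package from here on can score above this bound.
		bound := math.Log(math.E + float64(p.importedBy))
		if len(results) == limit && score(results[limit-1]) > bound {
			break
		}
		scanned++
		if p.rank == 0 {
			continue
		}
		results = append(results, p)
		sort.SliceStable(results, func(i, j int) bool {
			return score(results[i]) > score(results[j])
		})
		if len(results) > limit {
			results = results[:limit]
		}
	}
	return results, scanned
}

func main() {
	pkgs := []pkg{
		{"example.com/a", 1000, 0.9},
		{"example.com/b", 500, 0.8},
		{"example.com/c", 10, 1.0},
		{"example.com/d", 0, 1.0},
	}
	results, scanned := popularScan(pkgs, 2)
	fmt.Println(len(results), "results after scanning", scanned, "packages")
}
```

In the sample data the scan stops before the third package: even a perfect rank there cannot beat the lowest score already on the page.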

func (*DB) SearchSupport

func (db *DB) SearchSupport() internal.SearchSupport

SearchSupport implements the DataSource interface, supporting all search types.

func (*DB) StalenessTimestamp

func (db *DB) StalenessTimestamp(ctx context.Context) (time.Time, error)

StalenessTimestamp returns the index timestamp of the oldest module that is newer than the index timestamp of the youngest module we have processed. That is, let T be the maximum index timestamp of all processed modules. Then this function returns the minimum index timestamp of unprocessed modules that is no less than T, or an error that wraps derrors.NotFound if there is none.

The name of the function is imprecise: there may be an older unprocessed module, if one newer than it has been processed.

We use this function to compute a metric that is a lower bound on the time it takes to process a module since it appeared in the index.
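The definition in terms of T can be checked against a small in-memory model. The sketch below uses hypothetical names (indexEntry, stalenessTimestamp, errNotFound, day) and a slice in place of the module_version_states table. The real method runs a query and returns an error that wraps derrors.NotFound.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// indexEntry is a hypothetical stand-in for one module version row.
type indexEntry struct {
	timestamp time.Time
	processed bool
}

var errNotFound = errors.New("not found")

// day returns midnight UTC on the n-th of January 2024 (sample data only).
func day(n int) time.Time {
	return time.Date(2024, time.January, n, 0, 0, 0, 0, time.UTC)
}

// stalenessTimestamp returns the minimum timestamp among unprocessed
// entries that is no less than T, the maximum processed timestamp.
func stalenessTimestamp(entries []indexEntry) (time.Time, error) {
	var maxProcessed time.Time // T; the zero time if nothing is processed
	for _, e := range entries {
		if e.processed && e.timestamp.After(maxProcessed) {
			maxProcessed = e.timestamp
		}
	}
	var oldest time.Time
	found := false
	for _, e := range entries {
		if e.processed || e.timestamp.Before(maxProcessed) {
			continue
		}
		if !found || e.timestamp.Before(oldest) {
			oldest, found = e.timestamp, true
		}
	}
	if !found {
		return time.Time{}, errNotFound
	}
	return oldest, nil
}

func main() {
	entries := []indexEntry{
		{day(1), true},
		{day(2), false}, // older than T, so it is ignored
		{day(3), true},  // T
		{day(4), false},
		{day(5), false},
	}
	ts, err := stalenessTimestamp(entries)
	fmt.Println(ts.Format("2006-01-02"), err)
}
```

The unprocessed entry on day 2 shows the imprecision noted above: it is older than the returned timestamp but is skipped because a newer module has already been processed.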

func (*DB) Underlying

func (db *DB) Underlying() *database.DB

Underlying returns the *database.DB inside db.

func (*DB) UpdateLatestGoodVersion

func (db *DB) UpdateLatestGoodVersion(ctx context.Context, modulePath string) error

UpdateLatestGoodVersion updates the latest version of modulePath.

func (*DB) UpdateLatestModuleVersions

func (db *DB) UpdateLatestModuleVersions(ctx context.Context, vNew *internal.LatestModuleVersions) (_ *internal.LatestModuleVersions, err error)

UpdateLatestModuleVersions upserts its argument into the latest_module_versions table if the row doesn't exist, or the new version is later. It returns the version that is in the DB when it completes.

func (*DB) UpdateLatestModuleVersionsStatus

func (db *DB) UpdateLatestModuleVersionsStatus(ctx context.Context, modulePath string, newStatus int) (err error)

UpdateLatestModuleVersionsStatus updates or inserts a failure status into the latest_module_versions table. It only updates the table if it doesn't have valid information for the module path.

func (*DB) UpdateModuleVersionState

func (db *DB) UpdateModuleVersionState(ctx context.Context, mvs *ModuleVersionStateForUpdate) (err error)

UpdateModuleVersionState inserts or updates the module_version_states table with the results of a fetch operation for a given module version.

func (*DB) UpdateModuleVersionStatesForReprocessing

func (db *DB) UpdateModuleVersionStatesForReprocessing(ctx context.Context, appVersion string) (err error)

UpdateModuleVersionStatesForReprocessing marks modules to be reprocessed that were processed prior to the provided appVersion.

func (*DB) UpdateModuleVersionStatesForReprocessingLatestOnly

func (db *DB) UpdateModuleVersionStatesForReprocessingLatestOnly(ctx context.Context, appVersion string) (err error)

UpdateModuleVersionStatesForReprocessingLatestOnly marks modules to be reprocessed that were processed prior to the provided appVersion.

func (*DB) UpdateModuleVersionStatesForReprocessingReleaseVersionsOnly

func (db *DB) UpdateModuleVersionStatesForReprocessingReleaseVersionsOnly(ctx context.Context, appVersion string) (err error)

UpdateModuleVersionStatesForReprocessingReleaseVersionsOnly marks modules to be reprocessed that were processed prior to the provided appVersion.

func (*DB) UpdateModuleVersionStatesForReprocessingSearchDocumentsOnly

func (db *DB) UpdateModuleVersionStatesForReprocessingSearchDocumentsOnly(ctx context.Context, appVersion string) (err error)

UpdateModuleVersionStatesForReprocessingSearchDocumentsOnly marks modules to be reprocessed that are in the search_documents table.

func (*DB) UpdateModuleVersionStatesWithStatus

func (db *DB) UpdateModuleVersionStatesWithStatus(ctx context.Context, status int, appVersion string) (err error)

func (*DB) UpdateModuleVersionStatus

func (db *DB) UpdateModuleVersionStatus(ctx context.Context, modulePath, version string, status int, error string) (err error)

UpdateModuleVersionStatus updates the status and error fields of a module version.

func (*DB) UpdateSearchDocumentsImportedByCount

func (db *DB) UpdateSearchDocumentsImportedByCount(ctx context.Context) (nUpdated int64, err error)

UpdateSearchDocumentsImportedByCount updates imported_by_count and imported_by_count_updated_at.

It does so by completely recalculating the imported-by counts from the imports_unique table.

UpdateSearchDocumentsImportedByCount returns the number of rows updated.

func (*DB) UpdateSearchDocumentsImportedByCountWithCounts

func (db *DB) UpdateSearchDocumentsImportedByCountWithCounts(ctx context.Context, counts map[string]int) (nUpdated int64, err error)

func (*DB) UpsertSearchDocumentWithImportedByCount

func (db *DB) UpsertSearchDocumentWithImportedByCount(ctx context.Context, args UpsertSearchDocumentArgs, importedByCount int) (err error)

UpsertSearchDocumentWithImportedByCount is the same as UpsertSearchDocument, except it also updates the imported by count. This is only used for testing.

func (*DB) UpsertVersionMap

func (db *DB) UpsertVersionMap(ctx context.Context, vm *internal.VersionMap) (err error)

UpsertVersionMap inserts a version_map entry into the database.

type ModuleVersionStateForUpdate

type ModuleVersionStateForUpdate struct {
	ModulePath           string
	Version              string
	AppVersion           string
	Timestamp            time.Time
	Status               int
	HasGoMod             bool
	GoModPath            string
	FetchErr             error
	PackageVersionStates []*internal.PackageVersionState
}

type SearchOptions

type SearchOptions = internal.SearchOptions

type SearchResult

type SearchResult = internal.SearchResult

type UpsertSearchDocumentArgs

type UpsertSearchDocumentArgs struct {
	PackagePath    string
	ModulePath     string
	Version        string
	Synopsis       string
	ReadmeFilePath string
	ReadmeContents string
}

type UserInfo

type UserInfo struct {
	User       string
	NumTotal   int // number of processes running as that user
	NumWaiting int // number of that user's processes waiting for locks
}

UserInfo holds information about a DB user.

type VersionStats

type VersionStats struct {
	LatestTimestamp time.Time
	VersionCounts   map[int]int // from status to number of rows
}

VersionStats holds statistics extracted from the module_version_states table.
