
Granular Data Gatherer (gdg)
Collects Granular OS Metrics for Troubleshooting
Table of Contents
- About The Project
- Getting Started
- Technical Details
- Usage
- Build It Yourself
- Validated Distributions
- Roadmap
- Contributing
- License
- Reference
About The Project
gdg (Granular Data Gatherer) was developed in Go to fill a gap: the lack of an easy, open, all-in-one tool for collecting OS metrics for troubleshooting. OSWatcher and nmon cannot be the only viable options.
Getting Started
To get a local copy up and running, follow these simple steps.
Prerequisites
- A server, instance, or VM running a systemd-enabled Linux distribution
Installation
Download the binary from Releases (https://github.com/rfparedes/gdg/releases/latest/download/gdg) to /usr/local/sbin on the server and run:
sudo chmod +x /usr/local/sbin/gdg
Start it
sudo /usr/local/sbin/gdg -start
Check Status Anytime
/usr/local/sbin/gdg -status
Technical Details
- There are three components to gdg, each of which can be started or stopped separately:
  - granular data collection using standard utilities
  - rtmon collection of network state information
  - process D-state detection and automated sysrq-t
- gdg uses standard Linux utilities to perform its work, including:
  - iostat
  - top
  - mpstat
  - vmstat
  - ss
  - nstat
  - ps
  - nfsiostat
  - ethtool
  - ip
  - pidstat
  - rtmon
- gdg detects which utilities are available and uses only those that are installed. You can install any of the utilities above at any time, before or after setup. On most distributions, they are provided by five packages: sysstat (iostat, mpstat, pidstat), nfs-common or nfs-client (nfsiostat), procps (top, vmstat, ps), iproute2 (ss, nstat, ip, rtmon), and ethtool (ethtool).
- gdg creates a configuration file at /etc/gdg.cfg and a data directory at /var/log/gdg-data.
- gdg uses a systemd timer, so there is no running daemon.
- gdg installs a systemd service and systemd timer on -start.
- gdg removes the systemd service and systemd timer on -stop. All other files are left untouched.
- gdg collects data in the /var/log/gdg-data directory. Each subdirectory is named after the utility (e.g. iostat) that collected the data. Each subdirectory contains .dat files named in the format utility_YY.MM.DD.HH00.dat (e.g. meminfo_21.03.07.2300.dat). Each .dat file contains at most one hour's worth of data.
- To step chronologically through the data collected in a .dat file, search for the string zzz.
- rtmon logging must be enabled explicitly. Once enabled, it collects network state information directly from the kernel on an ongoing basis, using a systemd service that runs for as long as rtmon is enabled. This can be used to prove that service issues started after an external network failure. [1]
- If d-state is enabled, each interval run counts the processes in D state. If the count is greater than or equal to a user-defined value, echo t > /proc/sysrq-trigger is executed to capture a task trace of all processes. This is a one-time action: once a task trace has been triggered, it will not be triggered again until the user explicitly enables d-state again.
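The D-state check described above can be approximated by hand. The following is a minimal sketch, not gdg's actual implementation: the helper names count_dstate and dstate_exceeded and the example threshold of 5 are assumptions made for illustration, and the sketch only reports what would happen instead of writing to /proc/sysrq-trigger.

```shell
#!/bin/sh
# Hypothetical helpers, not part of gdg.

# Count processes whose state code starts with D (uninterruptible sleep).
count_dstate() {
    # "ps -eo state=" prints one state code per process, without a header.
    ps -eo state= | awk '$1 ~ /^D/ { n++ } END { print n + 0 }'
}

# Succeed when COUNT ($1) is greater than or equal to THRESHOLD ($2).
dstate_exceeded() {
    [ "$1" -ge "$2" ]
}

threshold=5   # assumed example value for NUMPROCS
n=$(count_dstate)
if dstate_exceeded "$n" "$threshold"; then
    # This is the point where gdg would run: echo t > /proc/sysrq-trigger
    echo "D-state processes: $n (>= $threshold), a task trace would be triggered"
else
    echo "D-state processes: $n (< $threshold), no task trace"
fi
```

Running this a few times while an issue is in progress gives a rough idea of a sensible NUMPROCS value before enabling d-state in gdg.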
Usage
To start collection in 30s intervals, run
sudo /usr/local/sbin/gdg -t 30 -start
To stop collection, run
sudo /usr/local/sbin/gdg -stop
To see the data collected
cd /var/log/gdg-data
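To find the right file and move through its samples, the naming format and the zzz search string from Technical Details can be combined. This is a rough sketch with assumptions: gdg_dat_path and gdg_markers are hypothetical helpers, the file prefix is assumed to match the subdirectory name, and the exact layout of the zzz marker lines is not specified in this README, so the sketch matches zzz anywhere on a line.

```shell
#!/bin/sh
# Hypothetical helpers for browsing gdg data; they are not part of gdg itself.

data_dir=/var/log/gdg-data   # data directory named in this README

# Print the path of the .dat file for UTILITY in a given hour.
# Usage: gdg_dat_path UTILITY ["YYYY-MM-DD HH:MM"]   (defaults to the current hour)
# Assumes the file prefix matches the subdirectory name.
gdg_dat_path() {
    if [ -n "${2:-}" ]; then
        stamp=$(date -d "$2" +%y.%m.%d.%H00)   # -d needs GNU or BusyBox date
    else
        stamp=$(date +%y.%m.%d.%H00)
    fi
    printf '%s/%s/%s_%s.dat\n' "$data_dir" "$1" "$1" "$stamp"
}

# List every line containing the zzz marker, prefixed with its line number.
gdg_markers() {
    grep -n 'zzz' "$1"
}

# Example: show where the current hour's iostat data would be stored.
gdg_dat_path iostat
```

With these defined, gdg_markers "$(gdg_dat_path iostat)" lists the markers in the current iostat file. In less, typing /zzz and then pressing n steps through them one at a time.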
To see the current status of gdg including start/stop status, version, interval, data location, and current size of collected data, run
/usr/local/sbin/gdg -status
e.g.
~~~~~~~~~~~~~~~
gdg status
~~~~~~~~~~~~~~~
VERSION: gdg-0.9.0
STATUS: started
RTMON: started
INTERVAL: 30s
DATA LOCATION: /var/log/gdg-data/
CONFIG LOCATION: /etc/gdg.cfg
CURRENT DATA SIZE: 33MB
~~~~~~~~~~~~~~~
DSTATE: stopped
NUMPROCS: 0
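The CURRENT DATA SIZE figure can be cross-checked independently of gdg. The snippet below is a small sketch: dir_size_mb is a hypothetical helper, and since du rounds to whole megabytes and gdg may compute the size differently, the two numbers may not match exactly.

```shell
#!/bin/sh
# Hypothetical helper, not part of gdg: size of a directory tree in whole megabytes.
dir_size_mb() {
    du -sm "$1" | awk '{ print $1 }'
}

data_dir=/var/log/gdg-data   # data directory named in this README
if [ -d "$data_dir" ]; then
    echo "$(dir_size_mb "$data_dir")MB"
else
    echo "$data_dir not found"
fi
```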
To change the interval (-t), or to pick up newly installed supported utilities, run
sudo /usr/local/sbin/gdg -reload -t 60
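Before reloading, it can be useful to see which of the supported utilities are currently on the PATH. The following sketch is an approximation and not gdg's own detection logic: check_utils is a hypothetical helper, and the list simply repeats the utilities named in Technical Details.

```shell
#!/bin/sh
# Utilities listed in this README; gdg's own detection logic may differ.
GDG_UTILS="iostat top mpstat vmstat ss nstat ps nfsiostat ethtool ip pidstat rtmon"

# Hypothetical helper: print "name: found" or "name: missing" for each argument.
check_utils() {
    for u in "$@"; do
        if command -v "$u" >/dev/null 2>&1; then
            echo "$u: found"
        else
            echo "$u: missing"
        fi
    done
}

# Word splitting of the list is intended here.
check_utils $GDG_UTILS
```

Some of these utilities live in /sbin or /usr/sbin, so running the check as root (or with sudo) may report more of them as found than running it as a regular user.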
To toggle rtmon logging on or off, run
sudo /usr/local/sbin/gdg -rtmon
To enable d-state detection, which triggers sysrq-t once at least NUMPROCS processes are in D state, run
sudo /usr/local/sbin/gdg -d <NUMPROCS>
For help
/usr/local/sbin/gdg -h
Build It Yourself
- You'll need a Go compiler installed
Clone it
git clone https://github.com/rfparedes/gdg.git
Build it
cd gdg
go build -o gdg
Move it
sudo mv gdg /usr/local/sbin
sudo chmod +x /usr/local/sbin/gdg
Start it
sudo /usr/local/sbin/gdg -start
Validated Distributions
gdg has been validated on:
- SLE-12 (SLES or SLES-SAP 12 all SPs)
- SLE-15 (SLES or SLES-SAP 15 all SPs)
- openSUSE Leap 12/15
- Debian 9
- Debian 10
- RHEL7
- RHEL8
Roadmap
See the open issues for a list of proposed features (and known issues).
Contributing
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
License
Distributed under the GPL-3.0 License. See LICENSE for more information.
Reference
[1] https://www.suse.com/support/kb/doc/?id=000019863