Quantcast
Channel: Tech Spear » Linux
Viewing all articles
Browse latest Browse all 10

Systemd Journal – An alternate for the syslog

$
0
0

There was a recent Google document which my friend shared with me. It was quite interesting that it was saying about an alternate for the syslog, which the Redhat developers has put forward. The two RH developers Kay Sievers and Lennart Poettering brought many new inventions in Unix applications and now they combined systemd with the hournal which is explained as follows.

I will just copy paste the important points being mentioned.

“Syslog has been around for ~30 years, due to its simplicity and ubiquitousness it is an invaluable tool for administrators. However, the number of limitations are substantial, and over time they have started to be serious problems:

 

  1. The message data is generally not authenticated, every local process can claim to be Apache under PID 4711, and syslog will believe that and store it on disk.
  2. The data logged is very free-form. Automated log-analyzers need to parse human language strings to a) identify message types, and b) parse parameters from them. This results in regex horrors, and a steady need to play catch-up with upstream developers who might tweak the human language log strings in new versions of their software. Effectively, in a away, in order not to break user-applied regular expressions all log messages become ABI of the software generating them, which is usually not intended by the developer.
  3. The timestamps generally do not carry timezone information, even though some newer specifications define support for it.
  4. Syslog is only one of many log systems on local machines. Separate logs are kept for utmp/wtmp, lastlog, audit, kernel logs, firmware logs, and a multitude of application-specific log formats. This is not only unnecessarily complex, but also hides the relation between the log entries in the various subsystems.
  5. Reading log files is simple but very inefficient. Many key log operations have a complexity of O(n). Indexing is generally not available.
  6. The syslog network protocol is very simple, but also very limited. Since it generally supports only a push transfer model, and does not employ store-and-forward, problems such as Thundering Herd or packet loss severely hamper its use.
  7. Log files are easily manipulable by attackers, providing easy ways to hide attack information from the administrator
  8. Access control is non-existent. Unless manually scripted by the administrator a user either gets full access to the log files, or no access at all.
  9. The meta data stored for log entries is limited, and lacking key bits of information, such as service name, audit session or monotonic timestamps.
  10. Automatic rotation of log files is available, but less than ideal in most implementations: instead of watching disk usage continuously to enforce disk usage limits rotation is only attempted in fixed time intervals, thus leaving the door open to many DoS attacks.
  11. Rate limiting is available in some implementations, however, generally does not take the disk usage or service assignment into account, which is highly advisable.
  12. Compression in the log structure on disk is generally available but usually only as effect of rotation and has a negative effect on the already bad complexity behaviour of many key log operations.
  13. Classic Syslog traditionally is not useful to handle early boot or late shutdown logging, even though recent improvements (for example in systemd) made this work.
  14. Binary data cannot be logged, which in some cases is essential (Examples: ATA SMART blobs or SCSI sense data, firmware dumps)

 

Many of these issues have become very visible in the recent past. For example, the recent, much discussed kernel.org intrusion involved log file manipulation which was only detected by chance. In addition, due to the limitations of syslog, at this point in time users frequently have to rely on closed source components to make sense of the gathered logging data, and make access to it efficient.

Logging is a crucial part of service management. On Unix, most running services connect to syslog to write log messages. In systemd, we built logging into the very core of service management: since Fedora 16 all services started are automatically connected to syslog with their standard output and error output. Regardless whether a service is started at early boot or during normal operation, its output ends up in the system logs. Logging is hence something so central, that it requires configuration to avoid it, and is turned from opt-in to opt-out. The net effect is a much more transparent, debuggable and auditable system. Transparency is no longer just an option for the knowledgeable, but the default.

 

During the development of systemd the limitations of syslog became more and more apparent to us. For example: one very important feature we want to add to ease the administrator’s work is showing the last 10 lines (or so) of log output of a service next to the general service information shown by “systemctl status foo.service”. Implementing this correctly for classic syslog is prohibitively inefficient, unreliable and insecure: a linear search through all log files (which might involve decompressing them on-the-fly) is required, and the data stored might be manipulated, and cannot easily (and without races) be mapped to the systemd service name and runtime.

 

To reduce all this to a few words: traditional syslog, after its long history of ~30 years has grown into a very powerful tool which suffers by a number of severe limitations.”

 

With those reasons being mentioned, they told that they have came up with a new deamon, which will give solution for all those existing problems and will have more features.

Systemd Journal Advantages:

  1. Simplicity: little code with few dependencies and minimal waste through abstraction.
  2. Zero Maintenance: logging is crucial functionality to debug and monitor systems, as such it should not be a problem source of its own, and work as well as it can even in dire circumstances. For example, that means the system needs to react gracefully to problems such as limited disk space or /var not being available, and avoid triggering disk space problems on its own (e.g. by implementing journal file rotation right in the daemon at the time a journal file is extended).
  3. Robustness: data files generated by the journal should be directly accessible to administrators and be useful when copied to different hosts with tools like “scp” or “rsync”. Incomplete copies should be processed gracefully. Journal file browsing clients should work without the journal daemon being around.
  4. Portable: journal files should be usable across the full range of Linux systems, regardless which CPU or endianess is used. Journal files generated on an embedded ARM system should be viewable on an x86 desktop, as if it had been generated locally.
  5. Performance: journal operations for appending and browsing should be fast in terms of complexity. O(log n) or better is highly advisable, in order to provide for organization-wide log monitoring with good performance
  6. Integration: the journal should be closely integrated with the rest of the system, so that logging is so basic for a service, that it would need to opt-out of it in order to avoid it. Logging is a core responsibility for a service manager, and it should be integrated with it reflecting that.
  7. Minimal Footprint: journal data files should be small in disk size, especially in the light that the amount of data generated might be substantially bigger than on classic syslog.
  8. General Purpose Event Storage: the journal should be useful to store any kind of journal entry, regardless of its format, it’s meta data or size.
  9. Unification: the numerous different logging technologies should be unified so that all loggable events end up in the same data store, so that global context of the journal entries is stored and available later. e.g. a firmware entry is often followed by a kernel entry, and ultimately a userspace entry. It is key that the relation between the three is not lost when stored on disk.
  10. Base for Higher Level Tools: the journal should provide a generally useful API which can be used by health monitors, recovery tools, crash report generators and other higher level tools to access the logged journal data.
  11. Scalability: the same way as Linux scales from embedded machines to super computers and clusters, the journal should scale, too. Logging is key when developing embedded devices, and also essential at the other end of the spectrum, for maintaining clusters. The journal needs to focus on generalizing the common use patterns while catering for the specific differences, and staying minimal in footprint.
  12. Universality: as a basic building block of the OS the journal should be universal enough and extensible to cater for application-specific needs. The format needs to be extensible, and APIs need to be available.
  13. Clustering & Network: Today computers seldom work in isolation. It is crucial that logging caters for that and journal files and utilities are from the ground on developed to support big multi-host installations.
  14. Security: Journal files should be authenticated to make undetected manipulation impossible.

You can get the original document from this link.

This is the overview of the new Systemd Journal. You can use this application along with the syslog or use it as an alternate. But this wont satisfy all the basic advantages of syslog. So, give it a try and suggest what you think to the developers.

 


Viewing all articles
Browse latest Browse all 10

Trending Articles