Troubleshooting with Windows Logs

Ultimate Guide to Logging - Your open-source resource for understanding, analyzing, and troubleshooting system logs

The most common reason people look at Windows logs is to troubleshoot a problem with their systems or applications.

This article presents common troubleshooting use cases for security, crashes, and failed services. The examples demonstrate how to diagnose the root cause of a problem using the events in your logs. Remember to check the warnings and errors preceding a critical event to see the bigger picture.

Security Log Events

The Security log includes security-related events, especially those related to authentication and access. These logs are your best place to search for unauthorized access attempts to your system.

The following events are of particular value in the Security log:

Successfully Logged On

These events include all successful logon attempts to a system. Each event includes categories of information:

  • Log details – log name, source, severity, event ID, and other log information.
  • Subject – account name, domain, and security information about the login.
  • Logon Information – the logon type is the method used to log on, such as at the local keyboard or remotely over the network. This field value is expressed as an integer, the most common being 2 (local keyboard) and 3 (network). Additional details about the logon are also available.
  • Impersonation Level – how much authority is given to the server when it is impersonating the client.
  • New Logon – name, domain, and other details of the account that was logged on.
  • Process Information – name and ID of the originating process.
  • Network Information – name, IP address, and port where the remote logon request originated. These values are left blank for local logons, or if the information can’t be found.
  • Detailed Authentication Information – details about this specific logon request.
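
Because the logon type is stored as an integer, a small lookup table makes events easier to read. The following Python sketch maps the commonly documented Logon Type values to labels. The function name and the wording of the labels are illustrative choices, not part of Windows.

```python
# Commonly documented Logon Type values for Windows logon events.
LOGON_TYPES = {
    2: "Interactive (local keyboard)",
    3: "Network",
    4: "Batch",
    5: "Service",
    7: "Unlock",
    8: "NetworkCleartext",
    9: "NewCredentials",
    10: "RemoteInteractive (Remote Desktop)",
    11: "CachedInteractive",
}


def describe_logon_type(value):
    """Return a readable label for a Logon Type field given as text or int."""
    try:
        code = int(str(value).strip())
    except ValueError:
        return f"Unrecognized ({value!r})"
    return LOGON_TYPES.get(code, f"Unknown logon type {code}")


# The sample event in this section has "Logon Type: 3".
print(describe_logon_type("3"))
```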

For an explanation of all possible fields, search for your log’s event ID. For example, successful logon attempts have an event ID of 4624, and Microsoft’s documentation for that ID describes each field. The following example shows a successful logon event generated on the accessed system when a logon session is created.

Log Name:      Security

Source:        Microsoft-Windows-Security-Auditing

Date:          6/26/2019 4:32:47 AM

Event ID:      4624

Task Category: Logon

Level:         Information

Keywords:      Audit Success

User:          N/A

Computer:      EC2AMAZ-ES915Q9

Description:

An account was successfully logged on.

 

Subject:

Security ID:         NULL SID

Account Name:        -

Account Domain:      -

Logon ID:       0x0

 

Logon Information:

Logon Type:          3

Restricted Admin Mode:     -

Virtual Account:     No

Elevated Token:      Yes

 

Impersonation Level:       Impersonation

 

New Logon:

Security ID:         EC2AMAZ-ES915Q9\Administrator

Account Name:        Administrator

Account Domain:      EC2AMAZ-ES915Q9

Logon ID:       0x2CA1ED0

Linked Logon ID:     0x0

Network Account Name: -

Network Account Domain:    -

Logon GUID:          {00000000-0000-0000-0000-000000000000}

 

Process Information:

Process ID:          0x0

Process Name:        -

 

Network Information:

Workstation Name:     EC2AMAZ-ES915Q9

Source Network Address:    -

Source Port:         -

 

Detailed Authentication Information:

Logon Process:       NtLmSsp

Authentication Package:    NTLM

Transited Services:   -

Package Name (NTLM only):  NTLM V2

Key Length:          128

Failed to Log On

Check Windows Security logs for failed logon attempts and unfamiliar access patterns. Authentication failures occur when a person or application passes incorrect or otherwise invalid logon credentials. Failed logons have an event ID of 4625.

These events show all failed attempts to log on to a system. A failure could mean someone is trying to break into the system. However, it could also mean someone forgot their password, the account expired, or an application was configured with the wrong password. These events include the following pieces of information:

  • Log details – name, source, and other log information.
  • Subject – account name, domain, and security information about the logon.
  • Logon Type – method used to log on, such as at the local keyboard or remotely over the network. This field value is expressed as an integer, the most common being 2 (local keyboard) and 3 (network).
  • Account for Which Logon Failed – name, domain, and other details for the failed logon.
  • Failure Information – failure reason and status of the attempt.
  • Process Information – name and ID of the originating process.
  • Network Information – name, IP address, and port where the remote logon request originated. These values are left blank for local logins, or if the information can’t be found.
  • Detailed Authentication Information – details about this specific logon request.
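
The Failure Information fields hold NTSTATUS codes, which are easier to act on once translated. The Python sketch below decodes the Status and Sub Status values that most often accompany failed logons. The table covers only a handful of well-known codes, and the function name is a hypothetical helper.

```python
# A few NTSTATUS values that commonly appear in failed logon events.
LOGON_FAILURE_CODES = {
    0xC0000064: "User name does not exist",
    0xC000006A: "User name is correct but the password is wrong",
    0xC000006D: "Bad user name or authentication information",
    0xC000006F: "Logon outside authorized hours",
    0xC0000070: "Logon from an unauthorized workstation",
    0xC0000071: "Password has expired",
    0xC0000072: "Account is disabled",
    0xC0000193: "Account has expired",
    0xC0000234: "Account is locked out",
}


def explain_status(hex_text):
    """Translate a Status or Sub Status value such as '0xC000006A'."""
    code = int(hex_text.strip(), 16)
    return LOGON_FAILURE_CODES.get(code, f"Unlisted status 0x{code:08X}")


# Values of the kind found in the Status and Sub Status fields.
print(explain_status("0xC000006D"))
print(explain_status("0xC000006A"))
```

In practice, the Sub Status is usually the more specific of the two values, so it is worth checking first.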

To learn more, you can read a description of all the fields of this event.

Here is an example of an unsuccessful logon attempt generated by the accessed system when the attempt failed:

Log Name:      Security

Source:        Microsoft-Windows-Security-Auditing

Date:          6/26/2019 4:42:52 AM

Event ID:      4625

Task Category: Logon

Level:         Information

Keywords:      Audit Failure

User:          N/A

Computer:      EC2AMAZ-ES915Q9

Description:

An account failed to log on.

 

Subject:

Security ID:         NULL SID

Account Name:        -

Account Domain:      -

Logon ID:       0x0

 

Logon Type:               3

 

Account For Which Logon Failed:

Security ID:         NULL SID

Account Name:        ADMINISTRATOR

Account Domain:

 

Failure Information:

Failure Reason:      Unknown user name or bad password.

Status:              0xC000006D

Sub Status:          0xC000006A

 

Process Information:

Caller Process ID:    0x0

Caller Process Name:  -

 

Network Information:

Workstation Name:     -

Source Network Address:    80.64.102.19

Source Port:         0

 

Detailed Authentication Information:

Logon Process:       NtLmSsp

Authentication Package:    NTLM

Transited Services:   -

Package Name (NTLM only):  -

Key Length:          0
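
A single failed logon is rarely interesting, but many failures from one address can point to a password-guessing attempt. The following Python sketch counts 4625 events per source address. It assumes the events have already been parsed into dictionaries; the keys `event_id` and `source_address` and the threshold of five are assumptions made for this illustration.

```python
from collections import Counter


def suspicious_sources(events, threshold=5):
    """Return {address: count} for addresses with many failed logons."""
    failures = Counter(
        event["source_address"]
        for event in events
        if event.get("event_id") == 4625
        and event.get("source_address") not in (None, "", "-")
    )
    return {addr: n for addr, n in failures.items() if n >= threshold}


# Hypothetical parsed events: six failures from one address, one from another.
events = [{"event_id": 4625, "source_address": "203.0.113.7"}] * 6
events.append({"event_id": 4625, "source_address": "198.51.100.2"})
events.append({"event_id": 4624, "source_address": "203.0.113.7"})

print(suspicious_sources(events))
```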

Application Failed to Log On

Well-written applications also log authentication failure events. Here’s an example of a failed logon attempt in SQL Server. It includes information about who attempted to log on and why the attempt failed.

Log Name:      Application

Source:        MSSQLSERVER

Date:          6/25/2019 4:42:52 AM

Event ID:      18456

Task Category: Logon

Level:         Information

Keywords:      Classic,Audit Failure

User:          N/A

Computer:      PSQ-Serv-1

Description:

Login failed for user 'sa'. Reason: Password did not match that for the login provided. [CLIENT: <local machine>]

Special Privileges Assigned

The Security log captures events when an account is granted elevated privileges. Several event IDs correspond to privilege assignment, but event ID 4672 is the one for special privilege assignments: it is logged when an account with sensitive privileges, such as a local administrator, logs on. The details show the account that logged on and the privileges assigned to it.

You can write custom scripts to filter these events for security audit reporting. You can also create a custom view in Event Viewer to display them.
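
As one possible starting point for such a script, the Python sketch below builds an XPath filter for a list of event IDs and wraps it in a command line for wevtutil, the event log utility that ships with Windows. It only assembles the command and does not run it. The function names are hypothetical, and the count of 20 is an arbitrary default.

```python
def build_event_query(event_ids):
    """Build an XPath filter that matches any of the given event IDs."""
    ids = sorted({int(i) for i in event_ids})
    if not ids:
        raise ValueError("at least one event ID is required")
    clause = " or ".join(f"EventID={i}" for i in ids)
    return f"*[System[({clause})]]"


def wevtutil_command(log_name, event_ids, count=20):
    """Return an argument list that queries the newest matching events."""
    return [
        "wevtutil", "qe", log_name,
        f"/q:{build_event_query(event_ids)}",
        "/f:text",        # plain-text output
        f"/c:{count}",    # maximum number of events to return
        "/rd:true",       # read newest events first
    ]


print(build_event_query([4672]))
print(" ".join(wevtutil_command("Security", [4672])))
```

On a Windows host, the argument list could be passed to `subprocess.run` from an elevated prompt, since reading the Security log requires administrative rights.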

These events include all successful logons by users with administrator privileges. The information includes items such as:

  • Log details – name, source, and other log information
  • Subject – account name, domain, and security information about the logon
  • Member – security ID and account name added
  • Group – security ID, group name and domain added
  • Privileges – list of all privileges assigned to the user

To learn more, you can read a description of all the fields of this log event.

Here’s an example of an elevated permissions event.

Log Name:      Security

Source:        Microsoft-Windows-Security-Auditing

Date:          6/26/2019 10:09:19 PM

Event ID:      4672

Task Category: Special Logon

Level:         Information

Keywords:      Audit Success

User:          N/A

Computer:      EC2AMAZ-ES915Q9

Description:

Special privileges assigned to new logon.

 

Subject:

Security ID:         EC2AMAZ-ES915Q9\Administrator

Account Name:        Administrator

Account Domain:      EC2AMAZ-ES915Q9

Logon ID:       0x38599B0

 

Privileges:          SeSecurityPrivilege

SeTakeOwnershipPrivilege

SeLoadDriverPrivilege

SeBackupPrivilege

SeRestorePrivilege

SeDebugPrivilege

SeSystemEnvironmentPrivilege

SeImpersonatePrivilege

SeDelegateSessionUserImpersonatePrivilege

Why Did My Server or Application Crash?

If you’re investigating why your server or application crashed, the Event log is a great place to start looking. The Application and System logs can tell you when and why a crash occurred. For example, they can indicate whether the crash was caused by a system problem or an application problem.

Almost all critical errors generate more than one event log entry. Warnings or errors usually appear before the final critical error. When troubleshooting, be sure to look at the messages preceding the crash error.

To find these events, filter your log data for a particular application name, then by critical or error events, and finally sort them by date. These are three of the most common events when troubleshooting a crash:

  • Unexpected Reboot
  • Application Hang
  • Application Fault
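
The filter-and-sort steps described above, together with the advice to look at what preceded the crash, can be sketched in Python as follows. This is a minimal illustration that assumes events have already been exported and parsed into dictionaries with `time`, `source`, and `level` keys. Those key names and the ten-minute window are assumptions, not an Event Viewer convention.

```python
from datetime import datetime, timedelta

SEVERE = {"Critical", "Error"}
NOTABLE = SEVERE | {"Warning"}


def crash_events(events, source):
    """Keep critical and error events from one source, oldest first."""
    matches = [e for e in events if e["source"] == source and e["level"] in SEVERE]
    return sorted(matches, key=lambda e: e["time"])


def events_before(events, crash, minutes=10):
    """Return warnings and errors logged shortly before a crash event."""
    start = crash["time"] - timedelta(minutes=minutes)
    earlier = [
        e for e in events
        if start <= e["time"] < crash["time"] and e["level"] in NOTABLE
    ]
    return sorted(earlier, key=lambda e: e["time"])


# Hypothetical data: a disk warning shortly before an application error.
crash = {"time": datetime(2019, 6, 24, 8, 35, 59),
         "source": "Application Error", "level": "Error"}
warning = {"time": datetime(2019, 6, 24, 8, 33, 10),
           "source": "Disk", "level": "Warning"}
for event in events_before([crash, warning], crash):
    print(event["time"], event["source"], event["level"])
```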

Unexpected Reboot

An unexpected reboot error appears in the log when the system fails to shut down and restart gracefully. Likely causes are that the operating system stopped responding and crashed, or that the server lost power. Look at the events preceding the reboot for a possible root cause.

Here is an excerpt from an unexpected reboot event:

Log Name:      System

Source:        Microsoft-Windows-Kernel-Power

Date:          25-06-2019 01:13:56

Event ID:      41

Task Category: (63)

Level:         Critical

Keywords:      (2)

User:          SYSTEM

Computer:      PSQ-Serv-1

Description:

The system has rebooted without cleanly shutting down first. This error could be caused if the system stopped responding, crashed, or lost power unexpectedly.

Application Hang

An application hang error appears in the Event log when a program running on your server stops responding. In this case, the server’s hardware and OS are functioning properly, but the application is either stuck in a loop or waiting for a resource that is not available.

This example shows an application that stopped responding to Windows and was then closed by Windows.

Log Name:      Application

Source:        Application Hang

Date:          6/19/2019 8:31:53 PM

Event ID:      1002

Task Category: (101)

Level:         Error

Keywords:      Classic

User:          N/A

Computer:      WIN-AOTBQV71KQP

Description:

The program YourPhone.exe version 1.19053.13.0 stopped interacting with Windows and was closed. To see if more information about the problem is available, check the problem history in the Security and Maintenance control panel.

 

Process ID: 1428

Start Time: 01d529e6311325cf

Termination Time: 4294967295

Application Path: C:\Program Files\WindowsApps\Microsoft.YourPhone_1.19053.13.0_x64__8wekyb3d8bbwe\YourPhone.exe

Report Id: 454e89c9-d311-4a7e-8650-da9016851456

Faulting package full name: Microsoft.YourPhone_1.19053.13.0_x64__8wekyb3d8bbwe

Faulting package-relative application ID: App

Hang type: Quiesce
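
The Start Time value in this event is not human-readable. It appears to be a Windows FILETIME, a hexadecimal count of 100-nanosecond intervals since January 1, 1601 (UTC). Under that assumption, the following Python sketch converts such a value into a timestamp. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# FILETIME counts 100-nanosecond ticks from this moment.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(hex_text):
    """Convert a hexadecimal FILETIME string to a UTC datetime."""
    ticks = int(hex_text.strip(), 16)
    # Ten ticks make one microsecond; sub-microsecond precision is dropped.
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


# Start Time from the application hang event above.
print(filetime_to_datetime("01d529e6311325cf").isoformat())
```

The same conversion applies to the "Faulting application start time" field in application fault events, which carries a `0x` prefix that `int(..., 16)` also accepts.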

Application Fault

An application fault error appears in the Event log when a program running on your server encounters a critical error. The cause is generally a bug in the application code or the system running out of memory. Here’s an example of a fault in a DLL loaded by svchost.exe.

Log Name:   Application

Source:     Application Error

Date:       6/24/2019 8:35:59 AM

Event ID:   1000

Task Category: (100)

Level:      Error

Keywords:   Classic

User:       N/A

Computer:   WIN-AOTBQV71KQP

Description:

Faulting application name: svchost.exe_SysMain, version: 10.0.17763.1, time stamp: 0xb900eeff

Faulting module name: sysmain.dll, version: 10.0.17763.503, time stamp: 0x572b556e

Exception code: 0xc0000005

Fault offset: 0x000000000004a21c

Faulting process id: 0x798

Faulting application start time: 0x01d529e58e7ac36d

Faulting application path: C:\WINDOWS\system32\svchost.exe

Faulting module path: c:\windows\system32\sysmain.dll

Report Id: 1d7447f6-a0bb-40e8-8905-47e79dff220e

Faulting package full name:

Faulting package-relative application ID:
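
The two most useful fields in a fault event are usually the faulting module and the exception code. The Python sketch below pulls both out of the description text and translates a few well-known exception codes. The lookup table is deliberately short, and the function names are hypothetical.

```python
import re

# A few well-known Windows exception codes.
EXCEPTION_CODES = {
    0xC0000005: "Access violation",
    0xC0000017: "Not enough memory",
    0xC000001D: "Illegal instruction",
    0xC0000094: "Integer division by zero",
    0xC00000FD: "Stack overflow",
    0xC0000374: "Heap corruption",
    0xC0000409: "Stack buffer overrun",
}

# Splits each "Key: value" line at the first colon.
FIELD_PATTERN = re.compile(r"^(?P<key>[^:\n]+):[ \t]*(?P<value>.*)$", re.MULTILINE)


def parse_fault(description):
    """Parse the description of an Application Error event into a dict."""
    return {m.group("key").strip(): m.group("value").strip()
            for m in FIELD_PATTERN.finditer(description)}


def summarize_fault(description):
    """Return (faulting module, meaning of the exception code)."""
    fields = parse_fault(description)
    module = fields.get("Faulting module name", "").split(",")[0] or None
    raw_code = fields.get("Exception code")
    if raw_code is None:
        return module, "No exception code found"
    code = int(raw_code, 16)
    return module, EXCEPTION_CODES.get(code, f"Unlisted exception code 0x{code:08x}")


SAMPLE = r"""Faulting application name: svchost.exe_SysMain, version: 10.0.17763.1, time stamp: 0xb900eeff
Faulting module name: sysmain.dll, version: 10.0.17763.503, time stamp: 0x572b556e
Exception code: 0xc0000005
Faulting application path: C:\WINDOWS\system32\svchost.exe
"""

print(summarize_fault(SAMPLE))
```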

Finding the Root Cause of a Failed Service

A Windows service is a special kind of application that runs in the background and has its own Windows session. People often want to know why a particular service did not start or run successfully.

You can find service failures in the System log by filtering on the Service Control Manager source and then filtering for critical or error events. Here are common examples of failed service events:

  • Service Failed to Start
  • Service Timeout
  • Windows Update Failure
  • Scheduled Task Delayed or Failed

Service Failed to Start

This error is logged when a service fails to start normally. In this example, the Group Policy Client did not start in a timely fashion. The event and its message indicate when the problem happened. Check the preceding messages to track down the root cause.

Log Name:      System

Source:        Service Control Manager

Date:          06-21-2019 10:49:27

Event ID:      7000

Task Category: None

Level:         Error

Keywords:      Classic

User:          N/A

Computer:      PSQ-Serv-1

Description:

The Group Policy Client service failed to start due to the following error:

The service did not respond to the start or control request in a timely fashion.

Service Timeout

A service timeout error appears when a service does not start within the expected period of time (the default is 30 seconds, shown as 30000 milliseconds in the event below). Normally, services are designed to start quickly and run continuously to spread out processing load. This error could be due to the service waiting for a resource that was not available. This example shows an event generated for the Windows Error Reporting Service.

Log Name:      System

Source:        Service Control Manager

Date:          6/23/2019 11:04:00 AM

Event ID:      7009

Task Category: None

Level:         Error

Keywords:      Classic

User:          N/A

Computer:      PSQ-Serv-1

Description:

A timeout was reached (30000 milliseconds) while waiting for the Windows Error Reporting Service service to connect.
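
When many of these events accumulate, it helps to extract the service name and timeout from the message so you can count which services time out most often. The following Python sketch does this with a regular expression modeled on the message above. The pattern and function name are illustrative and would need adjusting for localized or differently worded messages.

```python
import re

TIMEOUT_PATTERN = re.compile(
    r"A timeout was reached \((?P<ms>\d+) milliseconds\) "
    r"while waiting for the (?P<service>.+?) service to connect\."
)


def parse_service_timeout(message):
    """Return (service name, timeout in seconds), or None if no match."""
    match = TIMEOUT_PATTERN.search(message)
    if match is None:
        return None
    return match.group("service"), int(match.group("ms")) / 1000


message = ("A timeout was reached (30000 milliseconds) while waiting for "
           "the Windows Error Reporting Service service to connect.")
print(parse_service_timeout(message))
```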

Windows Update Failure

One system administration task is to watch whether computers in the network are failing to get Windows updates.

Windows Server Update Services (WSUS) is a patch management tool that automatically downloads and applies patches and security updates for Microsoft products from the Microsoft website. In most production installations, administrators want control over which patches are applied and when. This avoids unexpected behavior such as automatic reboots or applications breaking after a patch cycle. In many organizations, a centralized WSUS server downloads all patches, and administrators then schedule their distribution. The status of each Windows Update run is important to monitor.

In this example, a Microsoft update failed to install and generated an error code (0x80240017) we can look up for more information.

Log Name:      System

Source:        Microsoft-Windows-WindowsUpdateClient

Date:          5/31/2019 1:13:14 PM

Event ID:      20

Task Category: Windows Update Agent

Level:         Error

Keywords:      Failure,Installation

User:          SYSTEM

Computer:      nymph

Description:

Installation Failure: Windows failed to install the following update with error 0x80240017: Definition Update for Windows Defender Antivirus - KB2267602 (Definition 1.293.2654.0).
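
To look up failures in bulk, you can extract the error code and the KB article number from each message. The Python sketch below is one way to do that. The regular expressions are modeled on the single message shown above, so treat them as assumptions about the format rather than a complete parser.

```python
import re

FAILURE_PATTERN = re.compile(r"with error (?P<code>0x[0-9A-Fa-f]{8}): (?P<title>.+?)\.?$")
KB_PATTERN = re.compile(r"KB\d+")


def parse_update_failure(message):
    """Return the error code, update title, and KB number from a failure message."""
    match = FAILURE_PATTERN.search(message.strip())
    if match is None:
        return None
    kb = KB_PATTERN.search(match.group("title"))
    return {
        "error_code": match.group("code").lower(),
        "title": match.group("title"),
        "kb": kb.group(0) if kb else None,
    }


message = ("Installation Failure: Windows failed to install the following "
           "update with error 0x80240017: Definition Update for Windows "
           "Defender Antivirus - KB2267602 (Definition 1.293.2654.0).")
print(parse_update_failure(message))
```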

Scheduled Task Delayed or Failed

Another service people often watch is the Windows Task Scheduler. It’s similar to the Linux cron daemon. You can schedule and run programs, scripts, or commands on a recurring basis. Tasks can be scheduled for specific times or run in response to a trigger. For example, a task could be running a PowerShell backup script every night or copying files to an FTP server once every week.

The events generated from the Windows Task Scheduler can help confirm if your tasks are running according to the triggers and schedules you defined or if they are failing to launch. The Task Scheduler window has its own event viewer. Here is an example event from the log.

Log Name:      Microsoft-Windows-TaskScheduler/Operational

Source:        TaskScheduler

Logged:        6/26/2019 5:03:19 AM

Event ID:      201

Task Category: Action Completed

Level:         Information

Keywords:

User:          SYSTEM

Computer:      EC2AMAZ-ES915Q9

OpCode:        (2)

Description:

Task Scheduler successfully completed task "\GoogleUpdateTaskMachineUA" , instance "{57a63bbe-3b27-4367-8f45-8853e7c306a5}" , action "C:\Program Files (x86)\Google\Update\GoogleUpdate.exe" with return code 0.
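
An "Action Completed" event only says the action finished, so the return code is worth checking as well. The Python sketch below extracts the action and return code from a message like the one above and flags non-zero codes. Treating any non-zero code as a failure is an assumption, since some programs use non-zero codes for other outcomes, and the function name is hypothetical.

```python
import re

RESULT_PATTERN = re.compile(
    r'action "(?P<action>[^"]+)" with return code (?P<code>\d+)'
)


def parse_task_result(message):
    """Return the action path, return code, and a simple success flag."""
    match = RESULT_PATTERN.search(message)
    if match is None:
        return None
    code = int(match.group("code"))
    return {"action": match.group("action"), "return_code": code, "ok": code == 0}


message = (
    r'Task Scheduler successfully completed task "\GoogleUpdateTaskMachineUA" , '
    r'instance "{57a63bbe-3b27-4367-8f45-8853e7c306a5}" , '
    r'action "C:\Program Files (x86)\Google\Update\GoogleUpdate.exe" with return code 0.'
)
print(parse_task_result(message))
```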

What other troubleshooting use cases do you run into? Add your comments and let us know!