Hi folks, this post follows my previous post about capacity planning and provides you with the tools (and a VMware image ready to run) for you to implement it.
I'd like to start with a huge thank you to Miguel Rodriguez (Alfresco Support Engineer). He is the creator of this monitoring solution and also the person responsible for setting up the VMware image with all the tools, scripts, etc. My hero list just got bigger; Miguel takes a place right after Spider-Man and the Silver Surfer.
Monitoring Alfresco with Open Source Tools
Monitoring your Alfresco architecture is a known best practice. It allows you to track and store all relevant system metrics and events, which can help you:
- Troubleshoot possible problems
- Verify system health
- Check user behavior
- Build a robust historical data warehouse for later analysis and capacity planning
This post explains a typical monitoring scenario for an Alfresco deployment, using only open source tools.
I'm proposing a fully open source stack of monitoring tools that together form the global monitoring solution. The solution makes use of the following products:
- ElasticSearch [http://www.elasticsearch.org/]
- Logstash [http://logstash.net/]
- Redis [http://logstash.net/docs/1.2.1/outputs/redis]
- Kibana [http://www.elasticsearch.org/overview/kibana/]
- Graphite (Grafana) [http://grafana.org/]
- JavaMelody [https://code.google.com/p/javamelody/]
- Icinga [http://www.icinga.org/]
The solution will monitor all layers of the application, producing valuable data on all critical aspects of the infrastructure. This allows for proactive system administration, as opposed to a reactive approach: you can predict problems before they happen and take the necessary measures to keep the system healthy on all layers.
I see this approach as both a monitoring and a capacity planning system: it provides “near” real-time information updates, customizable reporting, and a custom search mechanism over the collected data.
The diagram below shows how the different components of the solution integrate. Note that we centralize data from all nodes and the various layers of the application in a single location.
The sample architecture being monitored consists of a cluster of two Alfresco/Share nodes for serving user requests and two Alfresco/Solr nodes for indexing/searching content.
Consider the three major components of the monitoring solution:
- Logstash file tailing to monitor Alfresco log files, and Logstash command execution to monitor specific components, e.g. processes, memory, disk, Java stack traces, etc.
- JavaMelody to monitor applications running in a JVM and other system resources.
- Icinga to send JMX requests to the Alfresco servers.
Dedicated Monitoring Server Download
All software components of the monitoring server come pre-installed on a VMware image that we offer for free (in the open source spirit :)).
You can download your copy of this monitoring server at http://eu.dl.alfresco.com.s3.amazonaws.com/release/Support/AlfrescoMonitoringVirtualServer/v1.0/AlfrescoMonitoringVirtualServer-1.0.tar
This is the ElasticSearch server that collects all the logs from the various components of the application and hosts the graphical user interfaces (Kibana and Grafana) used to view the monitoring data.
About JavaMelody
JavaMelody is used to monitor Java or Java EE application servers in QA and production environments. It measures and calculates statistics on the real operation of an application, based on how users actually use it. It is very easy to integrate into most applications and is lightweight, with almost no impact on the target systems.
The tool is mainly based on request statistics and evolution charts; for that reason it is an important add-on to our benchmarking project, as it allows us to see in real time the evolution charts of the most important aspects of our application.
It includes summary charts showing the evolution over time of the following indicators:
- Number of executions, mean execution times and percentage of errors of HTTP requests, SQL requests, JSP pages or methods of business façades (if EJB3, Spring or Guice)
- Java memory
- Java CPU
- Number of user sessions
- Number of JDBC connections
These charts can be viewed on the current day, week, month, year or custom period.
You can find detailed information about JavaMelody at https://code.google.com/p/javamelody/
Installing JavaMelody
It's really easy to attach the JavaMelody monitor to the Alfresco applications (alfresco.war and share.war) and to every other web application deployed on your application server.
Step 1
Configure JavaMelody monitoring on the Alfresco Tomcat by copying itextpdf-5.5.2.jar, javamelody.jar and jrobin-1.5.9.1.jar to the Tomcat shared lib folder under <tomcat_install_dir>\shared\lib, or to your application server's global classloader location (if not Tomcat).
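On a Linux installation this can be as simple as the copy below (a minimal sketch; the <tomcat_install_dir> placeholder and the jar file names must match your actual installation, and Alfresco's bundled Tomcat typically already has shared/lib on its shared.loader path):

cp itextpdf-5.5.2.jar javamelody.jar jrobin-1.5.9.1.jar <tomcat_install_dir>/shared/lib/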
Step 2
Edit the global Tomcat web.xml file (e.g. D:\alfresco\tomcat\conf\web.xml on a Windows install) to enable JavaMelody monitoring on every application. Add the following filter:
<filter>
    <filter-name>monitoring</filter-name>
    <filter-class>net.bull.javamelody.MonitoringFilter</filter-class>
</filter>
<filter-mapping>
    <filter-name>monitoring</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>
<listener>
    <listener-class>net.bull.javamelody.SessionListener</listener-class>
</listener>
And that's about it. After restarting, you can access the monitoring of every application at http://<your_host>:<server_port>/<web-app-context>/monitoring, for example http://localhost:8080/alfresco/monitoring
Monitoring Stages Breakdown
Stage 1 – Data Capturing (Logstash)
We capture monitoring data using several procedures:
- Scheduled jobs (DB queries, Alfresco JMX bean queries, OS-level commands)
- Log indexing with Logstash. We use Logstash to collect logs, parse them, and send them to ElasticSearch to be stored for later use (for example, for searching); see the shipper sketch after this list.
- The Alfresco audit log (when configured) is also parsed and indexed by ElasticSearch, providing all the enabled audit statistics.
- Metrics with JavaMelody
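As an illustration, a minimal Logstash "shipper" configuration for tailing an Alfresco log and forwarding the events to the Redis broker could look like the sketch below. The log path, server name and Redis key are assumptions to adapt to your installation; the multiline filter is one common way to glue Java stack trace lines back onto their parent event:

input {
    file {
        # tail the main Alfresco/Tomcat log (example path)
        path => "/opt/alfresco/tomcat/logs/catalina.out"
        type => "alfresco"
    }
}
filter {
    # join stack trace lines (starting with whitespace) to the previous event
    multiline {
        pattern => "^\s"
        what => "previous"
    }
}
output {
    # ship the events to the Redis broker on the monitoring server
    redis {
        host => "monitoring-server"
        data_type => "list"
        key => "logstash"
    }
}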
Stage 2 – Monitoring Data Archiving (ElasticSearch)
In the diagram above we can see the data capturing flow using Logstash and ElasticSearch. Let's look at some details of each of the boxes in the diagram.
Logstash is a tool for managing events and logs. You can use it to collect logs, parse them, and send them to ElasticSearch to be stored for later use (for example, for searching).
Redis is a log data broker, receiving data from log "shippers" and handing it over to a log "indexer".
ElasticSearch is a distributed, RESTful, free, Lucene-powered search engine/server.
Kibana 3 is a tool for displaying and interacting with your data.
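On the monitoring server, a matching Logstash "indexer" pulls the events off Redis and writes them into ElasticSearch. A minimal sketch, assuming both services run locally and the Redis key matches the shipper above:

input {
    # read the events queued by the shippers
    redis {
        host => "127.0.0.1"
        data_type => "list"
        key => "logstash"
    }
}
output {
    # index the events into the local ElasticSearch server
    elasticsearch {
        host => "127.0.0.1"
    }
}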
Stage 3 – Trending and Analysis (Kibana, Grafana)
To analyze the data and the trends we install two different GUIs on the monitoring server (Kibana and Grafana).
Kibana allows us to check the indexed logs with their metadata and to troubleshoot specific log traces. It provides a very robust search mechanism on top of the ElasticSearch indexes, and it delivers strategic technical insights in near real time, with a global overview of all layers of the platform, from almost any type of structured or unstructured data source.
In the flow above we can see how the information and statistics get to Grafana.
Grafana is a beautiful dashboard for displaying various Graphite metrics through a web browser. It has enormous potential, and it's easy to set up and customize for different business needs.
Let's take a closer look at the remaining components in the flow diagram.
Statsd is a network daemon that listens for statistics, like counters and timers sent over UDP, and forwards them to Carbon.
Carbon accepts metrics over various protocols and caches them in RAM as they are received, flushing them to disk at an interval using the underlying Whisper library.
Whisper provides fast, reliable storage of numeric data over time.
Grafana is an easy to use and feature-rich Graphite dashboard.
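To make the flow concrete: any script can feed this pipeline by sending StatsD's plain-text protocol over UDP (port 8125 is the StatsD default; the metric name and value below are made-up examples):

# send a JVM heap gauge (in MB) to statsd over UDP
echo "alfresco.node1.jvm.heapmb:742|g" | nc -u -w1 monitoring-server 8125

StatsD aggregates the values and flushes them to Carbon, Whisper stores them on disk, and Grafana charts them.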
Stage 4 – Monitoring
We use scheduled commands and index the resulting data in ElasticSearch, checking the following monitoring information from the Alfresco and Solr servers:
- JVM Memory Usage
- Server Memory
- Alfresco CPU utilization
- Overall server CPU utilization
- Solr Indexing Information
- Number of documents on Alfresco “live” store
- Number of documents on Alfresco “archive” store
- Number of concurrent users on Alfresco repository
- Alfresco Database pool occupation
- Number of active sessions on Alfresco Share
- Number of active sessions on Alfresco Workdesk
- Number of busy tomcat threads
- Number of current tomcat threads
- Number of maximum tomcat threads
These checks can be extended at any time to monitor any other target relevant to your use case, as in the sketch below.
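As an illustration, a scheduled (cron) shell script in this style could capture the server load and index it into ElasticSearch. This is a sketch, not one of the actual scripts shipped on the image; the index and field names are assumptions:

#!/bin/bash
# read the 1-minute load average from the kernel
LOAD=$(awk '{print $1}' /proc/loadavg)
# index the sample into a daily ElasticSearch index (hypothetical index/type names)
curl -s -XPOST "http://localhost:9200/monitoring-$(date +%Y.%m.%d)/load" -d "{
    \"@timestamp\": \"$(date -u +%Y-%m-%dT%H:%M:%SZ)\",
    \"host\": \"$(hostname)\",
    \"load_1m\": $LOAD
}"

Run it from cron, e.g. every minute: * * * * * /opt/monitoring/capture_load.sh (the script path is hypothetical).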
Stage 5 – Troubleshooting
While troubleshooting we use Kibana/Grafana and JavaMelody.
Kibana allows us to check the "indexed" logs with their metadata and verify exactly which classes are related to the problem, as well as the number of occurrences and the root of the exceptions.
Grafana shows us what/how/when server resources are being affected by the problem.
JavaMelody provides detailed information on crucial sections of the application. The goal of JavaMelody is to monitor Java or Java EE application servers in QA and production environments.
It produces graphs for Memory, CPU, HTTP Sessions, Threads, GC, JDBC Connections, SQL Hits, Open Files, Disk Space, Network I/O, Statistics for HTTP traffic, Statistics for SQL queries, Thread dumps, JMX Beans information and overall System Information. JavaMelody has a web interface for reporting on these statistics.
Using these three tools, troubleshooting a possible problem becomes a friendly task and boosts the speed of investigations that would normally take ages just to gather all the information needed to reach the root cause of the issue.
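Everything Kibana displays is also reachable directly through the ElasticSearch REST API, which is handy for quick checks from a shell. For example, to list the latest ERROR traces from the Logstash indexes (logstash-* is the Logstash default index pattern; the message field name is an assumption tied to your parsing rules):

curl -s "http://localhost:9200/logstash-*/_search?q=message:ERROR&size=5&pretty"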
Stage 6 – Notification and Reporting
We use Icinga to notify the delegated Alfresco administrator (by email) when there is a problem with the Alfresco system. Icinga is an enterprise-grade open source monitoring system that keeps watch over networks and resources, notifies the user of errors and recoveries, and generates performance data for reporting.
Icinga Web is highly dynamic and laid out as a dashboard with tabs, which allow the user to flip between the different views they need at any one time.
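Checks are declared with standard Nagios-style object definitions. As a sketch, an HTTP check against an Alfresco node could look like the following (the host name, port and URL are assumptions, and the generic-service template must exist in your Icinga configuration):

# alert when the Alfresco web application stops answering on port 8080
define service {
    use                     generic-service
    host_name               alfresco-node1
    service_description     Alfresco Tomcat HTTP
    check_command           check_http!-p 8080 -u /alfresco
}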
Stage 7 – Sizing Adjustments
Sizing remains a human action in the capacity and monitoring solution. By performing regular analysis of the monitoring/capacity planning data, we will know exactly when and how we need to scale our architecture.
The more data gets into ElasticSearch over the application life cycle, the more accurate the capacity predictions become, because they represent the “real” application usage during the defined period.
This plays a very important role when modeling and sizing the architecture for future business requirements.
7.1 – Peak Period Methodology
The peak period methodology is the most efficient way to implement a capacity planning strategy, as it allows you to analyze vital performance information while the system is under the most load/stress. In essence, the peak period methodology collects and analyzes data during a configurable peak period. This allows you to estimate the number of CPUs, the amount of memory and the number of cluster nodes required on the different layers of the application to support a given expected load.
The peak period may be an hour, a day, 15 minutes or any other period used to analyze the collected utilization statistics. Assumptions may be estimated based on business requirements or on specific benchmarks of a similar implementation.
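As a purely hypothetical example: if the busiest hour shows 200 concurrent users, each issuing on average 2 requests per minute, you are sizing for roughly 200 × 2 / 60 ≈ 6.7 requests per second at peak; comparing that figure with the CPU and memory headroom measured during the same window tells you how much extra capacity the next growth step requires.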
Your monitoring targets on an Alfresco installation
I've identified the following targets as candidates to participate in the monitoring system and have their data indexed and stored in ElasticSearch.
Database
- Transactions
- Number of Connections
- Slow Queries
- Query Plans
- Critical DM database queries (# documents of each mime type, …)
- Database server health (CPU, memory, IO, network)
- Database statistics integration
- Database sizing statistics (growth, etc.)
- Peak Period
Application Servers (Tomcats)
- Request and response times
- Access Logs (number of concurrent requests, number of concurrent users, etc.)
- CPU
- IO
- Memory
- Disk Space Usage
- Peak period
- Longest Request
- Threads ( Concurrent Threads, Busy Threads )
Application JVM
- JVM Settings Analysis
- GC Analysis
- Log Analysis (Errors, Exceptions, Warnings, Class Segmentation (Authorization, Permissions, Authentication))
- Audit Enabling and Analysis (Logins, Reads, Writes, Changed Permissions, Workflows Running, Workflow States)
- Caches Monitoring (Cache usage, invalidation, cache sizes)
- Protocol Analysis (FTP, CMIS, SharePoint, WebDAV, IMAP, CIFS)
- Architecture analysis
Search Subsystem(Solr)
- JMX Beans Monitoring
- Caches (Configuration, Utilization, Tuning, Inserts, Evictions and Hits)
- Indexes Health
- JVM Settings Analysis
- JVM Health Analysis
- Garbage Collection Analysis
- Query Debug (Response times, query analysis, slow queries, Peak periods)
- Search and Index Memory Usage
Network
- Input/Output
- High availability
- TCP Errors / Network errors at the network protocol level
- Security Analysis (Open ports, firewalls, network topology, proxies, encryption)
Shared File Systems
- Networking to client hosts
- Storage Type (SAN, NAS)
- I/O
Clustering
- Cluster member subscription analysis
- Cluster cache invalidation strategy and shared caches performance
- Cluster load balancing algorithm performance (cluster nodes load distribution)
The Alfresco Audit Trail
The monitoring solution also uses and indexes the Alfresco audit trail log, when audit is enabled. Alfresco audit should be used with caution as auditing too many events may have a negative impact on performance.
Alfresco has the option of enabling and configuring an audit trail log. It stores specific (configurable) user actions in a dedicated log file (the audit trail).
Building on the auditing architecture, the data producer org.alfresco.repo.audit.access.AccessAuditor gathers lower-level events into user-recognizable events. For example, the download or preview of content is recorded as a single read. Similarly, the upload of a new version of a document is recorded as a single create version. By contrast, the AuditMethodInterceptor data producer would typically record multiple events.
A default audit configuration file located at <alfresco.war>/WEB-INF/classes/alfresco/audit/alfresco-audit-access.xml is provided that persists audit data for general use. This may be enhanced to extract additional data of interest to specific installations. For ease of use, login success, login failure and logout events are also persisted by the default configuration.
Default audit filter settings are also provided for the AccessAuditor data producer, so that internal events are not reported. These settings may be customized (by setting global properties) to include or exclude auditing of specific areas of the repository, users or some other value included in the audit data created by AccessAuditor.
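As a sketch, enabling the audit subsystem and the AccessAuditor producer comes down to a few alfresco-global.properties entries like the ones below (property names as documented for Alfresco 4.x; the last line shows the documented default filter that excludes the internal System user):

# enable the audit subsystem and the alfresco-access audit application
audit.enabled=true
audit.alfresco-access.enabled=true
# enable the default audit filters; exclude events triggered by the System user
audit.filter.alfresco-access.default.enabled=true
audit.filter.alfresco-access.transaction.user=~System;~null;.*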
No additional functionality is provided for retrieving the persisted audit data; as all data is stored in the standard way, it is accessible via the AuditService search, audit web scripts, database queries and the Alfresco Explorer show_audit.ftl preview.
Detailed information on the audit possibilities is available at:
- https://wiki.alfresco.com/wiki/Content_Auditing
- http://docs.alfresco.com/4.2/concepts/audit-intro.html
And that's about it, folks. I hope you liked this article and that it helps you monitor your projects. More articles with relevant information from the field are coming up, so stay tuned.
All the best, One Love,
Luis