vuApp360 Dashboards
vuApp360 Dashboards come pre-packaged with vuApp360. To access them, select Dashboards from the left navigation menu. On the Dashboards landing page, use the Search bar to find App Catalog. The App Catalog dashboard is stored under the vuApp360 folder.
After navigating to the vuApp360 dashboard, the landing page will be the App Catalog page as shown below:
vuApp360 offers a variety of screens to provide you with valuable insights into your application's performance. Here are the available dashboards:
- Application Overview Dashboards: Provides a high-level summary of your application's performance metrics and key indicators.
  - App Catalog: Offers insights into the various applications within your system, allowing for comprehensive management and analysis.
  - Service Catalog: Displays a catalog of services utilized by your application, facilitating effective service management and optimization.
- Metrics Overview Dashboards: Presents an overview of performance metrics, enabling quick assessment and monitoring of key parameters.
  - Error Rate Breakdown: Breaks down error rates by type, source, or other factors, aiding in the identification and resolution of errors.
  - Latency Breakdown: Analyzes latency patterns across different components of your application, helping to pinpoint areas for optimization.
- Service Overview Dashboards: Provides a detailed overview of individual services, including performance metrics and usage statistics.
  - Service Summary: Summarizes key performance indicators and metrics for each service, offering insights for performance optimization.
  - Service Map: Visualizes the relationships and dependencies between different services in your application, enhancing system understanding and troubleshooting.
  - API Transaction: Tracks and analyzes API transactions, facilitating performance optimization and ensuring efficient API usage.
  - Analyze Errors: Offers insights to analyze and troubleshoot errors, aiding in error resolution and prevention.
  - Analyze Latency: Provides tools to analyze and diagnose latency issues, helping to improve system responsiveness and performance.
- Trace Map: Visualizes trace data in an interactive map format, providing a detailed view of application performance and behavior.
In the following sections, we will delve into each screen in detail, exploring their functionalities and how they empower you to optimize your application's performance and deliver an exceptional user experience.
Application Overview
The Application Overview dashboards in vuApp360 give IT Ops, Application Owners, and Business Heads comprehensive insight into their application landscape. They let you view all the applications in the environment, understand their criticality, and observe their performance metrics in real time. With the App Catalog and Service Catalog options, you can quickly identify problematic applications and services, predict performance changes, and efficiently troubleshoot issues. This section will guide you through the functionalities and benefits of the App Catalog and Service Catalog.
App Catalog
The App Catalog offers a bird's-eye view of all the applications in the environment, along with essential performance metrics and status information.
This is the default landing page after navigating to the vuApp360 dashboard, shown below:
Let's explore the key features that make the App Catalog a valuable tool for application observability:
- Overview of Apps: At a glance, you can access a comprehensive list of all applications. For each app, the console shows Total Requests together with the RED metrics (Rate, Errors, Duration): Request Rate, Error Rate, and P90 Latency (90th percentile latency).
- Criticality-Based Ordering: To prioritize troubleshooting efforts, the App Catalog presents apps in decreasing order of their criticality. Apps marked as critical require immediate attention, while apps with no issues are performing as expected.
- Navigating to Service Catalog Page: To access the Service Catalog page, click the Application name. This opens the Service Catalog with the filters for the chosen application applied automatically. For details on this page, refer to the Service Catalog section of this guide.
- Navigating to Error Rate Breakdown Page: Click the error metric shown for an application on the App Catalog page. This opens the Error Rate Breakdown Dashboard, where you can gain insights into error occurrences within the selected application. For details on this page, refer to the Error Rate Breakdown section of this guide.
- Navigating to Latency Breakdown Page: Click the latency metric shown for an application on the App Catalog page. This opens the Latency Breakdown Page, where you can analyze latency patterns for the chosen application. For details on this page, refer to the Latency Breakdown section of this guide.
By leveraging the App Catalog, users gain valuable insights into the applications' health, prioritize troubleshooting, and predict potential issues, ensuring a seamless user experience and smooth application performance.
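To make the App Catalog columns concrete, the following is a minimal sketch of how Total Requests, Request Rate, Error Rate, and P90 Latency could be derived from raw request records. It is illustrative only and is not how vuApp360 computes these values internally. The `Request` record, its field names, and the nearest-rank percentile method are all assumptions made for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    """One observed request. The field names are hypothetical."""
    duration_ms: float
    failed: bool


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def red_metrics(requests, window_seconds):
    """Summarize one time window of requests into Rate, Errors, and Duration (P90)."""
    total = len(requests)
    failed = sum(1 for r in requests if r.failed)
    return {
        "total_requests": total,
        "request_rate_per_s": total / window_seconds,
        "error_rate_pct": 100 * failed / total if total else 0.0,
        "p90_latency_ms": percentile([r.duration_ms for r in requests], 90) if total else None,
    }


# Ten requests observed over a 60-second window; only the slowest one failed.
sample = [
    Request(duration_ms=d, failed=(d == 900))
    for d in (100, 120, 130, 150, 160, 180, 200, 250, 400, 900)
]
summary = red_metrics(sample, window_seconds=60)
print(summary)
```

For this sample, the error rate is 10% and the P90 latency is 400 ms: nine of the ten requests completed in 400 ms or less, so the single 900 ms outlier does not move the P90 value.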
Service Catalog
The Service Catalog presents services in a structured tabular format, showcasing key performance metrics.
- Gain a consolidated view of all services within an application or all applications, simplifying observability and management.
- Effortlessly drill down into specific services for more detailed performance analysis and efficient troubleshooting.
To access the Service Catalog page, you can navigate to the second tab on the Application Overview Dashboard.
Here are the main functionalities of the Service Catalog:
- Overview of Services: Instantly access a comprehensive list of all services running in a specific application. The tabular layout displays essential details, including the service names, Request Rate, Error Rate, and P90 Latency.
- Service Summary Dashboard: By clicking on a specific service, users are seamlessly redirected to the dedicated service summary dashboard. This feature enables you to drill down into individual service details, providing a more in-depth analysis of its performance. The details of the Service Summary dashboard are discussed in further sections.
- Filter by Application: Effortlessly filter the Service catalog to focus on a specific application and view the services involved in that particular application. This feature aids in isolating performance issues and understanding the specific application.
When accessing the Service Catalog tab, be aware that by default, all services for all applications are displayed. However, if you navigate by clicking on a specific arrow button on the App Catalog page, that particular application will be selected, and the filter will be pre-applied.
Metrics Overview
The Metrics Overview dashboard within vuApp360 dashboards serves as a powerful tool for IT Ops, Application Owners, and Business Heads. It provides in-depth insights into application errors and latency performance, and allows for the precise identification of error sources within the environment. This comprehensive view enables stakeholders to address potential performance bottlenecks in the application ecosystem. To reach more granular details, such as Error Rate Breakdown and Latency Breakdown, click the respective metrics on the App Catalog page. The upcoming sections explore each of these tabs in detail.
Error Rate Breakdown
On entering the Error Rate Breakdown tab, users see a landing page designed for precise error analysis.
The interactive interface allows users to select the desired Application, tailoring the analysis to specific parameters and enhancing the focus on relevant data points.
- The visual representation of the error rate percentage is provided against Hosts, Services, and API Transactions, enabling users to monitor and assess application health effectively.
- Failed API Transactions: Presented in tabular format, this section showcases Failed API Transactions with columns such as API Transaction, Total API Transactions, Failed API Transactions, and Failure %.
Users can click on an API Transaction to open the API Transaction tab on the Service Overview Dashboard.
- Failed Services: Displayed in a tabular format, this section highlights key metrics of Failed Services like Host, Total API Transactions, Failed, and Failure %.
Users can conveniently navigate to the corresponding Service Summary tab by clicking on entries under the Service column.
- Host with Failed API Transactions: This table provides insights into specific hosts by presenting columns like Host, Total API Transactions, Failed API Transactions, and Failure %.
- Failed Traces: In tabular format, this section includes details such as Trace ID, API Transactions, Host, Application, Service, P90 Latency, HTTP Status, and Status.
Users can navigate to the Trace Map Dashboard and the API Transaction tab on the Service Overview dashboard by clicking on Trace ID and API Transactions respectively.
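The Total, Failed, and Failure % columns in the tables above follow the same pattern regardless of whether rows are grouped by host, service, or API transaction. The sketch below illustrates that grouping on a handful of made-up transactions. The record keys (`host`, `service`, `api`, `failed`) and the sort order are assumptions for illustration, not the product's data model or documented behavior.

```python
from collections import defaultdict


def failure_breakdown(transactions, key):
    """Group transactions by `key` and compute Total, Failed, and Failure % per group."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for txn in transactions:
        totals[txn[key]] += 1
        if txn["failed"]:
            failures[txn[key]] += 1
    rows = [
        {
            key: name,
            "total": totals[name],
            "failed": failures[name],
            "failure_pct": round(100 * failures[name] / totals[name], 2),
        }
        for name in totals
    ]
    # Highest failure percentage first (an assumption, not documented behavior).
    return sorted(rows, key=lambda row: row["failure_pct"], reverse=True)


# Hypothetical sample data.
transactions = [
    {"host": "host-a", "service": "cart", "api": "POST /cart", "failed": True},
    {"host": "host-a", "service": "cart", "api": "POST /cart", "failed": False},
    {"host": "host-b", "service": "cart", "api": "GET /cart", "failed": False},
    {"host": "host-b", "service": "pay", "api": "POST /pay", "failed": True},
    {"host": "host-b", "service": "pay", "api": "POST /pay", "failed": True},
]

by_service = failure_breakdown(transactions, "service")
by_host = failure_breakdown(transactions, "host")
for row in by_service:
    print(row)
```

Calling `failure_breakdown(transactions, "api")` would produce the equivalent of the Failed API Transactions table from the same records.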
Latency Breakdown
On entering the Latency Breakdown tab, users see a landing page designed for precise latency analysis.
The interactive interface allows users to select the desired Application, tailoring the analysis to specific parameters and enhancing the focus on relevant data points.
- The visual representation of the latency is provided against Hosts, Services, and API Transactions, enabling users to monitor and assess application health effectively.
- Host with Slow API Transactions: Presented in tabular format, this section showcases hosts with slow API Transactions with columns such as Host, Total API Transactions, Number of Slow API Transactions, Slow %, and P90 Latency.
- Services with MAX Latency: Displayed in a tabular format, this section highlights key metrics like Service, Total API Transactions, Number of Slow API Transactions, and P90 Latency.
Users can conveniently navigate to the corresponding Service Summary tab by clicking on entries under the Service column.
- Slow API Transactions: This table provides insights into specific slow API Transactions by presenting columns like API Transactions, Total API Transactions, Number of Slow API Transactions, Slow %, and P90 Latency.
Entries under the API Transactions column are clickable and open the API Transaction tab on the Service Overview Dashboard, with filters pre-applied for that API transaction.
- Slow Traces: In tabular format, this section includes details such as Trace ID, API Transactions, Host, Application, Service, P90 Latency, HTTP Status, and Status.
Users can navigate to the Trace Map Dashboard and the API Transaction tab on the Service Overview dashboard by clicking on Trace ID and API Transactions respectively.
Service Overview
The Service Overview in vuApp360 presents a multifaceted approach to analyzing and optimizing your application's performance. Its five dedicated tabs (Service Summary, Service Map, API Transaction, Analyze Errors, and Analyze Latency) offer a comprehensive suite of tools for IT Ops, Application Owners, and Business Heads.
- Service Summary provides a consolidated view of individual services, offering insight cards with RED metrics and historical comparisons. Visualizations of Request Rate, Error Rate, and Latency empower users to identify trends, ensuring proactive decision-making.
- Service Map introduces an interactive graphical representation of service interactions, including 3rd party services. It provides real-time insights into service status, alerts, and latency, enhancing incident response and facilitating a detailed analysis of individual service performance.
- API Transaction delves into the specifics of transactional performance, visualizing RED metrics over time. With the ability to analyze latency distribution, slow traces, and failed traces, this tab provides a nuanced understanding of API transactions.
- Analyze Errors offers a detailed analysis of errors, including occurrences over time, error messages, and insights into failed hosts, API transactions, and traces. This tab streamlines the identification and resolution of errors for improved application reliability.
- Analyze Latency focuses on latency distribution, slow API transactions, and traces. Visualizations and tables empower users to pinpoint and address latency issues, ensuring optimal application performance.
With these five tabs, Service Overview serves as a centralized hub for proactive monitoring, efficient issue resolution, and comprehensive insights into your application's health.
Service Summary
The Service Summary empowers you to observe and analyze the performance of individual services and their dependencies effectively. The Service Summary offers a consolidated view of your selected service within the application.
From the dropdown, select the Service you want to analyze.
- On the Service Summary dashboard, you will find the insight cards displaying RED metrics.
- Request Rate, Error Rate, and Latency Graphs: These graphs offer pictorial representations of the respective metrics for the selected service. Visualizing these metrics enables you to spot trends and anomalies, facilitating proactive decision-making.
In the Latency graph, P90 Latency is shown by default. You can visualize P50, P75, P95, or P99 as well. To update the visualization, click on the respective legend entry.
- API Transactions: This table lists all the API transactions related to this service, along with their Request Rate, Error Rate, Latency (P90), and Success %.
The entries under the API Transaction column are clickable, navigating you to the API Transaction tab.
- All Hosts: A table displaying all the hosts with their corresponding RED metrics and Success %.
- Latency Distribution By Host (Top 10): The latency for the particular service is visualized against the Hosts in a bar chart.
- Dependencies: A table displaying all the dependencies with their corresponding RED metrics and Success %.
- Time Spent By Dependency (Top 10): The latency for the particular service is visualized against the Dependencies in a bar chart.
- Slow Operations: The slow operations for this service are listed in the table with their Start Time, Duration (in ms), HTTP Status Code, and Timestamp.
Service Map
The Graphical View introduces an interactive Service Map, visually representing how various services interact with each other for a particular application. This graphical representation helps you understand service dependencies and interactions. The Service Map includes the following features:
- Service Interaction Visualization: The Service Map offers a clear view of service interactions, illustrating how services interact with each other and how multiple services collaborate to process a request. Directional arrows facilitate the comprehension of the interrelation between services by indicating the flow of communication from one service to another.
- Inclusion of 3rd Party Services: The Service Map encompasses 3rd party services like DB, Kafka, Redis, and others. You can observe their status and their interactions with application services, facilitating holistic performance observability.
The current release of the product has certain limitations in showcasing DB components in the Service Map. Also, repositioning nodes in the Service Map is not possible.
- Service Status and Alerts: The color of the service in the Service Map turns Amber or Red based on its status, P90 latency, and error rate. This visual cue alerts you to critical issues that require immediate attention, enhancing incident response. Nodes and links are shown in green when no issues are observed.
- Detailed Service Insights: By clicking on any service, users can see RED metrics for that particular service, including its status. This feature streamlines the analysis of individual service performance.
- Latency Visualization: The Service Map visually represents the latency between services using links connecting them. The color of these links changes based on latency SLAs or active alerts, providing quick insights into potential performance bottlenecks.
- You can zoom in, zoom out, or fit the Service Map to the screen using the options available on the right side.
- The automatic Service Map can handle up to 100 services interacting with each other.
- Data can load in near real-time for up to 100,000 events per second (EPS).
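The color rule described under Service Status and Alerts can be pictured as a simple threshold check. The sketch below is a rough illustration only: this guide does not state the actual thresholds or the precedence between latency, error rate, and alerts, so the `Thresholds` values and the "worst signal wins" rule are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical limits; real values would come from configured SLAs or alert rules."""
    warn_p90_ms: float = 500.0
    crit_p90_ms: float = 1000.0
    warn_error_pct: float = 1.0
    crit_error_pct: float = 5.0


def node_color(p90_ms, error_pct, limits=Thresholds()):
    """Return 'red', 'amber', or 'green' for a service node; the worst signal wins."""
    if p90_ms >= limits.crit_p90_ms or error_pct >= limits.crit_error_pct:
        return "red"
    if p90_ms >= limits.warn_p90_ms or error_pct >= limits.warn_error_pct:
        return "amber"
    return "green"


print(node_color(p90_ms=120, error_pct=0.2))   # healthy service
print(node_color(p90_ms=650, error_pct=0.2))   # latency above the warning limit
print(node_color(p90_ms=120, error_pct=7.5))   # error rate above the critical limit
```

The same idea would apply to links, with the latency between two services compared against its own limits.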
API Transaction
The third tab on the Service Overview Dashboard visualizes the API Transactions.
At the top, you can select the API Transaction to be observed in the dashboard. By default, All is selected. However, if you navigate to this tab by clicking on a particular API Transaction, that API Transaction is pre-selected. You can also update the filters for Application and Service if required.
- At the top of this dashboard, visualizations of RED metrics against time are shown for the selected API Transaction.
- Latency Distribution: Number of Traces against their latency is statistically visualized in this graph.
You can select a particular range with click & drag to be analyzed in detail on any visualization. This will apply the selected range as a time range filter.
- Slow Traces: The slow traces are listed in this table with the columns Trace ID, API Transaction, Application, Host, Latency, and Status.
The Trace ID is clickable, navigating you to the Trace Map Dashboards for deeper analysis.
- Failed Traces: The failed traces are listed in this table with the columns Trace ID, API Transaction, Application, Host, and P90 Latency.
The Trace ID is clickable, navigating you to the Trace Map Dashboards for deeper analysis.
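The Latency Distribution graph in this tab plots the number of traces against their latency. As a rough sketch of the underlying idea, the example below groups trace latencies into fixed-width buckets and counts the traces in each. The bucket width and the sample latencies are assumptions chosen for illustration; the dashboard's actual bucketing is not described in this guide.

```python
from collections import Counter

BUCKET_MS = 100  # hypothetical bucket width


def latency_histogram(latencies_ms, bucket_ms=BUCKET_MS):
    """Count traces per fixed-width latency bucket, keyed by the bucket's lower bound in ms."""
    counts = Counter(int(latency // bucket_ms) * bucket_ms for latency in latencies_ms)
    return dict(sorted(counts.items()))


# Hypothetical trace latencies in milliseconds.
latencies = [42, 87, 95, 110, 130, 180, 240, 260, 910]
histogram = latency_histogram(latencies)

# Render a small text histogram, one row per non-empty bucket.
for lower, count in histogram.items():
    print(f"{lower:>4}-{lower + BUCKET_MS:<4} ms | {'#' * count}")
```

In this sample, most traces fall below 300 ms and a single trace sits in the 900-1000 ms bucket, which is the kind of outlier range you would select with click and drag for closer analysis.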
Analyze Errors
- The statistics of Server Errors (5xx), Client Errors (4xx), and Exceptions are displayed on the console.
- Error Occurrences: The number of error occurrences against the timestamp is visualized in the bar chart.
- Error Messages for Exceptions: A tabular format displaying Error Type, Error Message, Count, and Last Seen.
- Failed Hosts: For the failed hosts, the Host Name and RED metrics are listed in a table.
- Failed API Transactions: For the failed API Transactions, Request Rate and Error % are listed. The API transactions are clickable and navigate to the API Transaction tab.
- Failed Traces: This table helps you analyze failed traces. It displays Trace ID, API Transaction, Application, Host, and P90 Latency. Click the Trace ID for deeper analysis in the Trace Map Dashboard.
Analyze Latency
- Latency Distribution: Number of Traces against their latency is statistically visualized in this graph.
- Slow API Transactions: For the slow API transactions, the API transaction name, Request Rate, and P90 Latency are listed in a table. The API transactions are clickable and navigate to the API Transaction tab.
- Slow Traces: For the slow traces, Trace ID, API Transaction, Application, Host, P90 Latency, and Status are listed in a table. The Trace ID is clickable and navigates to the Trace Map Dashboard.
- Latency Visualizations: Bar charts show Latency Distribution by Host (Top 10), Time Spent by Dependency (Top 10), and Time Spent by API Transaction (Top 10).
Trace Map
The Trace Map presents an organized, hierarchical view of a trace's progression across various services and components, capturing key steps in the trace and their associated metrics.
Key Components of the Trace Map:
The following fields provide in-depth insights into each span in the Trace Map:
- Span Name: The Span Name represents individual operations or actions within a trace. Each span corresponds to a specific step in processing a request, such as a function call, database query, or HTTP request. These spans are arranged hierarchically to show the sequence and structure of the request flow.
- Service: Service identifies the application or microservice responsible for executing a particular operation captured by a span. It provides insight into which part of your system is handling the specific span operation.
- Component: Component refers to the technology, framework, or library used within a service to execute the operation captured by the span. It offers visibility into the specific tool or mechanism that powers the operation.
- Destination: Destination indicates the target system, service, or resource with which the span interacts. It highlights where a request is directed during the span’s operation, such as a database, external API, or another microservice.
- Start Time: Start Time shows when the span’s execution began, providing a timestamp for each step of the trace. This helps to construct a timeline, enabling you to analyze the sequence and timing of operations.
- Execution Time (Exec(ms)): This represents the total execution time for the span, including any dependent child spans. It shows how long the operation and its sub-operations took to complete.
- Self Execution (Self Exec % and Self Exec(ms)): Self Exec % is the percentage of the total trace execution time spent solely on the span itself, excluding time taken by nested or dependent operations. This focuses on the span's own execution time. Self Exec(ms) measures the time the span took to execute without counting the time taken by any child or dependent spans.
- HTTP Code: The HTTP Code represents the status code returned for spans involving HTTP requests or responses. It indicates the success or failure of an HTTP operation. Common HTTP codes include:
  - 200-299: Success (indicated in green)
  - 300-399: Redirection (indicated in orange)
  - 400+: Errors (indicated in red)
- Span Status: Span Status represents the overall outcome of an operation captured by the span. It reflects whether the operation completed successfully or encountered an issue during execution. Note that the Span Status reflects the application's internal logic, while the HTTP status code adheres to standard client-server protocols. Even if the HTTP code indicates an error (e.g., 400 or 500), the span status may still indicate success if the error was handled gracefully by the application.
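To illustrate how Exec(ms), Self Exec(ms), and Self Exec % relate to each other, here is a minimal sketch that derives the self-execution values from a small, made-up trace and maps HTTP codes to the colors listed above. The `Span` record and its field names are hypothetical. The sketch also assumes that child spans run one after another (so their times can be summed) and that the root span's Exec(ms) is the total trace execution time.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    """Minimal span record. The field names are hypothetical."""
    span_id: str
    parent_id: Optional[str]
    exec_ms: float  # total execution time, including child spans


def self_exec(spans):
    """Return {span_id: (self_exec_ms, self_exec_pct)} for the spans of one trace."""
    # Sum the execution time of the direct children of each span.
    child_time = {}
    for span in spans:
        if span.parent_id is not None:
            child_time[span.parent_id] = child_time.get(span.parent_id, 0.0) + span.exec_ms
    # Assumption: the root span's Exec(ms) equals the total trace execution time.
    trace_ms = next(s.exec_ms for s in spans if s.parent_id is None)
    result = {}
    for span in spans:
        own_ms = max(span.exec_ms - child_time.get(span.span_id, 0.0), 0.0)
        result[span.span_id] = (own_ms, round(100 * own_ms / trace_ms, 1))
    return result


def http_code_color(code):
    """Map an HTTP status code to the color ranges listed in this section."""
    if 200 <= code <= 299:
        return "green"
    if 300 <= code <= 399:
        return "orange"
    if code >= 400:
        return "red"
    return "unknown"  # codes below 200 are not covered by the list


# Hypothetical trace: the root calls a database span and a cache span.
trace = [
    Span("root", None, 200.0),
    Span("db", "root", 120.0),
    Span("cache", "root", 30.0),
    Span("query", "db", 100.0),
]
timings = self_exec(trace)
for span_id, (own_ms, own_pct) in timings.items():
    print(f"{span_id:<6} self exec: {own_ms:6.1f} ms ({own_pct}%)")
```

In this example the root span has an Exec(ms) of 200 but a Self Exec(ms) of only 50, because 150 ms is spent in its two children. The Self Exec % values of all spans add up to 100, which is what makes the column useful for spotting where the time in a trace is actually spent.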