Purpose
The strategy being adopted to analyze and store the unprecedented volumes of
data being gathered by current and future High Energy and Nuclear Physics (HENP)
experiments is the coordinated deployment of Grid technologies such as those
being developed for the Particle Physics Data Grid and the
Grid Physics Network. It is anticipated that these technologies will
be deployed at hundreds of institutes that will
be able to search out and analyze information from an interconnected worldwide
grid of tens of thousands of computers and storage devices. This in turn will
require the ability to sustain, over long periods, the transfer of
large amounts of data between collaborating sites with relatively low latency.
The purpose of the IEPM-BW project is to develop and use an infrastructure
to make active end-to-end application and network performance
measurements on high-performance network links, such as those used worldwide
by Grid applications and other academic and research (A&R) applications
deployed over high-performance networks such as ESnet, Internet2
and other A&R networks in the developed world.
Tasks
The following are the major tasks:
- Develop/deploy a simple, robust, ssh-based active end-to-end
application and network measurement and management infrastructure.
- Install/integrate a base set of measurement tools into
the infrastructure, make regular measurements, and record the results.
These tools include ping, traceroute, iperf, pipechar, bbcp and bbftp.
- Develop data reduction, analysis, reporting, forecasting and archiving tools.
- Compare and validate the various tools and determine the regions
of applicability.
- Install new network tools (e.g. the INCITE tools, pathrate and pathload)
and application tools (e.g. GridFTP) into the infrastructure, and use it to
evaluate the performance of the tools and their relevance.
- Evaluate new TCP stacks such as HTCP, High Speed TCP and FAST, and
compare them with the default stacks.
- Provide access to the data for research, forecasting, validation etc.
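As an illustration of the "make regular measurements and record the results" task, the following is a minimal sketch of turning the summary printed by ping into a record that could be archived. It assumes the summary format printed by the Linux iputils version of ping; the function name, the field names and the sample text are hypothetical and are not taken from the IEPM-BW code.

```python
import re

# Assumption: summary lines as printed by Linux iputils ping, e.g.
#   5 packets transmitted, 4 received, 20% packet loss, time 4005ms
#   rtt min/avg/max/mdev = 150.1/152.4/155.0/1.9 ms
LOSS_RE = re.compile(
    r"(\d+) packets transmitted, (\d+) received.*?([\d.]+)% packet loss"
)
RTT_RE = re.compile(
    r"rtt min/avg/max/mdev = ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms"
)


def parse_ping_summary(text):
    """Return loss and RTT statistics as a dict, or None if no summary is found."""
    loss = LOSS_RE.search(text)
    if loss is None:
        return None
    record = {
        "sent": int(loss.group(1)),
        "received": int(loss.group(2)),
        "loss_pct": float(loss.group(3)),
    }
    rtt = RTT_RE.search(text)
    if rtt is not None:  # the rtt line is absent when every packet is lost
        record["rtt_min_ms"] = float(rtt.group(1))
        record["rtt_avg_ms"] = float(rtt.group(2))
        record["rtt_max_ms"] = float(rtt.group(3))
    return record


# Hypothetical sample output, used only to exercise the parser.
SAMPLE = """\
--- example.org ping statistics ---
5 packets transmitted, 4 received, 20% packet loss, time 4005ms
rtt min/avg/max/mdev = 150.1/152.4/155.0/1.9 ms
"""

if __name__ == "__main__":
    print(parse_ping_summary(SAMPLE))
```

A scheduler could call such a parser after each run and append the record, with a timestamp and the remote host name, to the archive used by the analysis tools.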
Novel Ideas
- Provides two complementary kinds of performance measurement, a low-impact
overview and a more intensive, detailed view:
  - The low-impact measurements provide delay, loss and connectivity
    information for most of the Internet-connected world over long
    (many years) time periods.
  - The higher-impact measurements are oriented to high-performance links
    (e.g. grid sites, ESnet and Internet2 connected sites) and provide
    network and application high-throughput performance measurements,
    allowing comparisons, identification of bottlenecks, etc.
- Uses both passive and active measurements
- Continuous, robust measurement, analysis and web-based reporting,
with results and data available worldwide.
- Simple infrastructure enabling rapid deployment, location within an
application host, and local site management to avoid security issues.
- Provides simple forecasting for applications and for optimizing the
frequency of measurements.
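The forecasting idea in the last item can be sketched as follows. The source does not say which forecasting method the project uses, so this example assumes an exponentially weighted moving average (EWMA). The rule for adjusting the measurement interval is likewise hypothetical and is shown only to illustrate how a forecast might drive the frequency of measurements.

```python
def ewma_forecast(samples, alpha=0.3):
    """Forecast the next value as an exponentially weighted moving average.

    samples: throughput measurements, oldest first (e.g. in Mbit/s).
    alpha: weight given to each new sample (assumed value, 0 < alpha <= 1).
    """
    if not samples:
        raise ValueError("need at least one sample")
    forecast = samples[0]
    for value in samples[1:]:
        forecast = alpha * value + (1 - alpha) * forecast
    return forecast


def next_interval(base_seconds, forecast, observed, tolerance=0.2):
    """Hypothetical policy: measure less often while forecasts stay accurate.

    Doubles the interval when the forecast was within `tolerance`
    (relative error) of the observed value, and halves it otherwise.
    """
    if observed <= 0:
        return base_seconds / 2  # treat a failed measurement as a surprise
    relative_error = abs(observed - forecast) / observed
    if relative_error <= tolerance:
        return base_seconds * 2
    return base_seconds / 2


if __name__ == "__main__":
    history = [90.0, 95.0, 100.0, 98.0]  # made-up throughput samples
    prediction = ewma_forecast(history)
    print(prediction, next_interval(3600, prediction, observed=97.0))
```

A larger alpha makes the forecast follow recent measurements more closely, while a smaller alpha smooths out short-lived fluctuations.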
Impact
These measurements will supplement the uses to which the
PingER measurements are put and, in particular,
will be critical for:
- Providing planning information to applications, grid and network planners by:
  - Providing an understanding of the performance achievable today in
    network throughput and application (file copy and FTP) throughput.
  - Providing historical information on growth, incremental and
    sudden changes, and patterns (e.g. diurnal) of changes in performance.
  - Providing input on how to improve measurement tools such as iperf.
- Providing troubleshooting information to networks and users by:
  - Indicating when there are incremental or sudden changes and
    the magnitude of the changes, and providing alerts.
  - Helping pinpoint whether a performance issue is at the network layer
    or application layer, or at some sub-component such as a disk.
- Providing networkers and application developers with a
better understanding of how networks and applications work together by:
  - Validating and correlating how network performance relates to
    delay and loss performance (e.g. bandwidth estimators).
  - Assisting users in selecting the optimum network (e.g. windows,
    streams, QoS) and application (e.g. compression) configuration
    options.
- Identifying the critical bottlenecks, such as disk, CPU speed, operating
system and network bandwidth, for high-throughput application
performance.
- Providing a public-domain network performance database,
together with analyses and
navigable reports from active monitoring.
This information can be used for further research, for predictions and
for application steering.
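The troubleshooting item on indicating sudden changes and providing alerts can be illustrated with a minimal sketch. It assumes a simple rule that is not taken from the source: compare the median of the most recent measurements with the median of an earlier baseline window and flag the change when the relative difference exceeds a threshold. The function name, the window contents and the 30% threshold are all hypothetical.

```python
from statistics import median


def detect_change(baseline, recent, threshold=0.3):
    """Return the relative change if it is large enough to alert on, else None.

    baseline: earlier throughput measurements (e.g. the previous week).
    recent: the latest measurements (e.g. the last few hours).
    threshold: assumed minimum relative change that triggers an alert.
    """
    reference = median(baseline)
    if reference == 0:
        return None  # no meaningful relative change can be computed
    change = (median(recent) - reference) / reference
    if abs(change) >= threshold:
        return change
    return None


if __name__ == "__main__":
    # Made-up throughput samples in Mbit/s.
    week = [100.0, 102.0, 98.0, 101.0, 99.0]
    today = [60.0, 62.0, 58.0]
    change = detect_change(week, today)
    if change is not None:
        print(f"ALERT: throughput changed by {change:+.0%}")
```

Medians are used rather than means so that a single outlier measurement does not trigger an alert by itself. A production system would also need to account for the diurnal patterns mentioned above.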