rvbd.shark
The Shark package offers a set of interfaces to control and work with a Cascade Shark Appliance. The functionality in the module includes examining packet sources, creating and controlling views and capture jobs, working with trace files and trace clips, and querying and modifying appliance settings.
This documentation assumes you are already familiar with the Riverbed Shark Appliance, specifically concepts like Capture Jobs and Views. If you are not already familiar with these concepts, see the introduction to the Shark architecture and/or the Shark manual.
The primary interface to the Shark-related flyscript functionality is the class rvbd.shark.Shark.
An instance of this object represents a connection to a Shark server,
and can be used to examine packet sources and existing views on the
server, as well as to configure and create new views, capture jobs, etc.
There are many more classes in the Shark libraries, representing things like views, capture jobs, and trace clips. These should never be instantiated directly from scripts; they are returned by methods on Shark objects.
Shark objects
The Shark class is the main interface to interact with a Shark Appliance. Among other things, it makes it possible to manage views, jobs, files and trace clips, and to query and modify the appliance settings.
Shark(host, port=None, auth=None, force_version=None, force_ssl=None)
host is the name or IP address of the Shark appliance to connect to.
port is the TCP port on which the Shark appliance listens. If this parameter is not specified, the function will try to determine the port automatically.
auth defines the authentication method and credentials to use to access the Shark. It should be an instance of rvbd.common.UserAuth or rvbd.common.OAuth.
force_version is the API version to use when communicating. If unspecified, the latest version supported by both this implementation and the Shark appliance is used.
See the base Service class for more information about additional supported functionality.
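For example, a minimal connection sketch (the hostname and credentials are illustrative):

    from rvbd.shark import Shark
    from rvbd.common import UserAuth

    # Connect to the appliance; the port and API version are negotiated
    # automatically when not specified.
    shark = Shark('shark.example.com', auth=UserAuth('admin', 'admin'))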
The following methods provide general information about a Shark:
Get the Shark appliance overall info.
Return Value: a named tuple with the server parameters
Return the API protocol version used by this shark.
Return a dictionary with the information useful prior to logging in to the shark, including the protocol version, banner, and whether or not a login purpose should be requested from the user.
Get the Shark appliance storage info.
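For example, a sketch building on the connection above; the method names get_serverinfo() and get_protocol_version() are assumptions based on the descriptions here, so verify them against your flyscript version:

    # Assumed method names; verify against your flyscript release.
    info = shark.get_serverinfo()      # named tuple of server parameters
    print('API protocol version: %s' % shark.get_protocol_version())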
The following methods are used to access views. Each of these methods returns a view object.
Get a list of View objects, one for each open view on the Shark appliance.
get_open_view_by_handle(handle)
Look up the view with the given handle and return its View object.
Create a new view on this Shark.
src
identifies the source of packets to be analyzed.
It may be any packet source object.
columns
specifies what information is extracted from
packets and presented in this view. It should be a list
of Key
and Value
objects.
filters
is an optional list of filters that can be used
to limit which packets from the packet source are processed
by this view.
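A sketch of creating a view over a capture job follows; the Key/Value import path and the column names (ip.src, generic.packets, generic.bytes) are assumptions to verify against your flyscript version:

    from rvbd.shark.types import Key, Value   # assumed import path

    job = shark.get_capture_job_by_name('default_job')    # illustrative job name
    columns = [Key(shark.columns.ip.src),                  # one row per source address
               Value(shark.columns.generic.packets),
               Value(shark.columns.generic.bytes)]

    view = shark.create_view(job, columns, None)           # no filters
    try:
        print(view.get_data())    # single-output shorthand, see View objects below
    finally:
        view.close()              # permanently deletes the view on the server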
create_view_from_template(source, template, name=None, sync=True)
Create a new view on this Shark using template. For Shark releases 9.5.x and earlier, template should be an XML view document; for later releases, template should be a JSON view description.
The following methods are used to access packet sources (e.g., to obtain an object that can be used as an argument to create_view, create_job, etc.). The objects they return are described below in the section Packet source objects.
get_interfaces(force_refetch=False)
Return a list of Interface objects, corresponding to the capture interfaces available on the Shark.
If force_refetch is True, the list of interfaces will be re-fetched from the Shark. Otherwise, the list may be cached.
get_interface_by_name(ifname, force_refetch=False)
Return an Interface object corresponding to the interface named ifname.
If force_refetch is True, the list of interfaces will be re-fetched from the Shark. Otherwise, the list may be cached.
get_capture_jobs(force_refetch=False)
Return a list of CaptureJob objects, corresponding to the capture jobs currently running on the Shark.
If force_refetch is True, the list of jobs will be re-fetched from the Shark. Otherwise, the list may be cached.
get_capture_job_by_name(jobname, force_refetch=False)
Return a CaptureJob object for the capture job named jobname.
If force_refetch is True, the list of jobs will be re-fetched from the Shark. Otherwise, the list may be cached.
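For example, a sketch that enumerates capture interfaces and looks up a job (the job name is illustrative; get_state() is described under Capture Job objects below):

    for ifc in shark.get_interfaces():
        print(ifc)

    job = shark.get_capture_job_by_name('default_job')
    print(job.get_state())    # e.g. RUNNING or STOPPED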
Create a capture job on this Shark, and return a capture job object that can be used to work with the new job.
interface is an Interface object identifying the source of the packets that this job will capture. The method get_interfaces() returns interface objects for all interfaces on this Shark.
name is the name that the job will have after being created, and that will be used to identify the job in subsequent operations.
packet_retention_size_limit is the maximum size on disk that the job will reach before old packets begin to be discarded. It can be expressed as a number (in which case the unit is bytes) or as a string containing a number and unit (e.g. "1.3GB") or a percentage of total disk space (e.g. "20%").
packet_retention_packet_limit is the maximum number of packets that the job can contain before old packets begin to be discarded.
packet_retention_time_limit is a datetime.timedelta object specifying the maximum time interval that the job may contain before old packets begin to be discarded.
bpf_filter is a filter, based on the Wireshark capture filter syntax, that selects which incoming packets should be saved in the job.
snap_length is the portion of each packet that will be written to disk, in bytes. The default value of 65535 ensures that every packet is captured in its entirety.
indexing_size_limit is the maximum size on disk that the index may reach before old index data begins to be discarded. It is specified in the same format as packet_retention_size_limit.
If indexing_synced is True, the job's Microflow Index will be automatically pruned to cover the same time extent as the captured packets.
indexing_time_limit is a datetime.timedelta object indicating how long index data should be retained. This argument is only meaningful if indexing_synced is False.
If start_immediately is True, the job is started after it has been created.
requested_start_time and requested_stop_time are datetime.datetime objects that, if specified, determine the absolute times at which the job will start and stop.
If stop_rule_size_limit is specified, the job will stop storing new packets when the given size on disk is reached. It is specified in the same format as packet_retention_size_limit.
If stop_rule_packet_limit is specified, the job will stop storing new packets when the given number of packets is reached.
If stop_rule_time_limit is specified, it should be a datetime.timedelta object, and the job will stop storing new packets when the given time has elapsed.
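A sketch of creating and starting a job with retention limits and a BPF filter (the interface and job names are illustrative):

    import datetime

    ifc = shark.get_interface_by_name('mon0')
    job = shark.create_job(
        ifc, 'web-traffic',
        packet_retention_size_limit='20%',                       # or e.g. '1.3GB'
        packet_retention_time_limit=datetime.timedelta(days=7),
        bpf_filter='tcp port 80',                                # Wireshark capture filter syntax
        start_immediately=True)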
get_clips(force_refetch=False)
Return a list of TraceClip objects, corresponding to the trace clips available on the Shark.
If force_refetch is True, the list of clips will be re-fetched from the Shark. Otherwise, the list may be cached.
create_clip(job, filters, description, locked=True)
Create a clip in the Shark appliance.
get_trace_clip_by_description(description, force_refetch=False)
Return a TraceClip object for the trace clip with the given description.
If force_refetch is True, the list of clips will be re-fetched from the Shark. Otherwise, the list may be cached.
Note: Clips don't have descriptions by default. A description can be added to a clip by right-clicking on it in Pilot.
get_files(force_refetch=False)
Return a list of TraceFile, MergedFile or Multisegment objects, corresponding to the trace files available on the Shark.
If force_refetch is True, the list of files will be re-fetched from the Shark. Otherwise, the list may be cached.
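For example (a sketch; the clip description is illustrative and must have been assigned, e.g. via Pilot):

    clip = shark.get_trace_clip_by_description('interesting traffic')
    for f in shark.get_files():
        print(f)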
The following methods are used to work directly with trace files on the Shark appliance filesystem:
Get a directory. It will trigger an exception if the directory does not exist. Return Value: a reference to the directory.
Given a path, retrieve the File object associated with it.
Check whether a path exists; works for both files and directories. Return Value: True if the path exists, False otherwise.
Create a new directory. It will trigger an exception if the directory already exists. Return Value: a reference to the new directory.
create_multisegment_file(path, files=None)
Creates a multisegment file. 'path' is the full name of the new file and 'files' is a list of File objects. Return Value: a reference to the new file.
create_merged_file(path, files=None)
Creates a merged file. 'path' is the full name of the new file and 'files' is a list of File objects. Return Value: a reference to the new file.
upload_trace_file(path, local_file)
Uploads a new trace file. 'path' is the complete destination path, and 'local_file' is the local file to upload. Return Value: a reference to the new trace file.
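For example, a sketch that uploads a local capture and merges it with an existing file (the paths are illustrative, and other_file stands for a File object obtained elsewhere, e.g. from get_files()):

    uploaded = shark.upload_trace_file('/admin/http.pcap', 'local_http.pcap')
    merged = shark.create_merged_file('/admin/merged.pcap',
                                      [uploaded, other_file])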
The following methods are used to access extractor fields.
Return a list of all extractor fields available on this Shark.
find_extractor_field_by_name(fname)
Return a specific extractor field given its name
search_extractor_fields(string)
Search through the extractor fields to find the ones that match the given string in either the field name or the description.
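For example, a sketch using the field layout described under Extractor Field objects below:

    field = shark.find_extractor_field_by_name('ip.source_ip')
    for f in shark.search_extractor_fields('dns'):
        print(f.name, f.desc)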
Packet source objects
The objects described in this section are used to access packet sources. None of these objects are directly instantiated from external code; they are returned by methods on Shark or other routines. Any of the objects in this section may be used as the src argument to Shark.create_view.
Capture Job objects
Capture job objects are used to work with capture jobs.
These objects are not instantiated directly but are returned from Shark.get_capture_jobs and Shark.get_capture_job_by_name.
Capture job objects have the following properties:
The capture job name
The actual size of the capture job, corresponding to the size shown in the Shark UI.
The maximum size of the capture job, corresponding to the size shown in the Shark UI.
An Interface object, representing the interface used as a packet source for this job.
The internal capture job handle. The handle is sometimes required for advanced operations on the job.
The following methods access information about a job:
Return status information about the capture job.
Return the state of the job (e.g. RUNNING, STOPPED)
Return statistics about the capture job.
Return statistics about the capture job index.
The following methods are useful for controlling a capture job:
Start the job on the Shark appliance.
Stop the job on the Shark appliance.
Clear the job's data on the Shark appliance.
The following methods can be used to create and delete jobs, though create() does the same thing as Shark.create_job().
Create a new capture job.
Delete the job from the Shark appliance.
Finally, these methods are useful for creating trace clips and for downloading raw packets from a capture job.
add_clip(filters, description, locked=True)
Create a new trace clip under this job.
filters limit which packets from the job should go into this clip. Since all packets in the new clip will be kept on disk if the clip is locked, this list typically includes time filters to limit the clip to a fixed time interval.
description is a name for the clip; it is shown next to the clip in the Pilot user interface.
If locked is True, packets in the new clip will not be deleted from disk as long as the clip exists. Note that locking packets in trace clips necessarily reduces the amount of disk capacity available to existing capture jobs.
Returns a trace clip object representing the new clip.
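A sketch of creating a locked clip covering the last ten minutes follows; the TimeFilter import path is an assumption to verify against your flyscript version:

    import datetime
    from rvbd.shark.filters import TimeFilter   # assumed import path

    end = datetime.datetime.now()
    start = end - datetime.timedelta(minutes=10)
    clip = job.add_clip([TimeFilter(start, end)], 'last 10 minutes', locked=True)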
export(filename, filters=None)
Export the CaptureJob packets selected by filters to a file
Trace Clip objects
Trace clip objects are used to work with trace clips.
These objects are not instantiated directly but are returned from methods such as Shark.get_clips.
These methods provide a way to obtain clip objects, though it is usually easier to use methods like Shark.get_clips.
Get the complete list of trace clips on a given Shark.
shark is an rvbd.shark.shark.Shark object.
Returns a list of TraceClip objects.
Trace clip objects have the following properties:
Returns the description of the clip
Returns the size of the clip
add(shark, job, filters, description, locked=True)
Create a new clip given a Shark connection and a job handle.
shark is a Shark object.
job is the capture job to use.
filters is the list of filters to associate with the clip. For the clip to be valid, there must be at least one time filter in this list.
description will be associated with the new clip. (The description is shown in grey next to the clip in Pilot.)
Returns a trace clip object for the new clip.
Erase the clip from the Shark.
Extractor Field objects
Extractor Field objects represent individual extractor fields.
These objects are returned by Shark.get_extractor_fields, Shark.find_extractor_field_by_name, and Shark.search_extractor_fields.
Each extractor field is a Python namedtuple with the following fields:
name: the name of the field, e.g., ip.source_ip
desc: a brief description of the field
type: a string describing the type of the field (e.g., integer, string, ip address, etc.)
View objects
View objects are returned from Shark.create_view.
A View object encapsulates everything needed to read data from an existing view on a Shark. Every view has one or more associated outputs. For example, the standard "Bandwidth over time" view has separate outputs for "bits over time", "bytes over time", and "packets over time". In flyscript, a View object contains an associated Output object for each output. To read data from a view, you must first locate the appropriate Output object, then use the method Output.get_data().
View objects have the following methods:
Returns a boolean indicating whether data can be retrieved from this view. If this function returns False, the view data is still being computed; its progress can be followed with the method get_progress().
For views applied to non-live sources (i.e., trace clips or trace files), returns an integer between 0 and 100 indicating the percentage of the packet source that has been processed. Output data is not available on the view until this value reaches 100.
Return a dictionary object with details about the time range covered by this view. The returned object has the following entries:
start: the time of the first packet for which data is available
end: the end time of the last sample
delta: the size of each sample
Close this view on the server (which permanently deletes the view plus any associated configuration and output data).
Returns the legend from the output in this view. Shorthand for all_outputs()[0].get_legend(). Raises a LookupError if the view has more than one output.
Returns the data from the output in this view. Shorthand for all_outputs()[0].get_data(). Raises a LookupError if the view has more than one output. For a full description of the function arguments, refer to the method Output.get_data().
Returns an iterator for the data from the output in this view. Shorthand for all_outputs()[0].get_iterdata(). Raises a LookupError if the view has more than one output. For a full description of the function arguments, refer to the method Output.get_iterdata().
Return a list of Output objects, one for each output in this view.
Return the Output object corresponding to the output id.
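Putting these methods together, a sketch of waiting for a view applied to a trace clip and reading every output (is_ready() is an assumed name for the readiness check described above):

    import time

    while not view.is_ready():     # assumed name; see the readiness method above
        time.sleep(1)

    for output in view.all_outputs():
        legend = output.get_legend()
        for sample in output.get_iterdata():
            print(sample)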
Output objects
Return the legend for this output. The legend consists of an ordered list of entries, one for each column of data in this output. Each entry is a dictionary object with the following entries:
name: a short name for the field
description: a slightly longer, more descriptive name for the field
field: the name of the extractor field that produces this column
calculation: how data from multiple packets in a sample is aggregated (e.g., "SUM", "AVG", etc.)
The following entries are intended for internal Shark use: type, id, base, and dimension.
Get the data for this view. This function downloads the whole dataset before returning it, so it's useful when random access to the view data is necessary.
The arguments have the same meanings as the corresponding arguments to get_iterdata(); see its documentation for a full explanation.
Returns an iterator to the output data. This function is ideal for sequential parsing of the view data, because it downloads the dataset incrementally as it is accessed.
start and end are datetime.datetime objects representing the earliest and latest packets that should be considered. If start and end are unspecified, the start/end of the underlying packet source are used.
delta is a datetime.timedelta object that can be used to override the default data aggregation interval. If this parameter is unspecified, the underlying view sample interval (which defaults to 1 second) is used. If this parameter is specified, it must be an even multiple of the underlying view sample interval.
If aggregated is True, the parameter delta is automatically computed to be the full extent of this request (i.e., the difference between the effective start and end times). This is useful if you do not care about timeseries data (e.g., if the data from this view is to be plotted in a single chart that has no time component).
The sortby parameter is one of the fields of the output (x1, x2, ...).
The sorttype parameter can be one of:
ascending: the output is sorted from smallest to largest
descending: the output is sorted from largest to smallest
The fromentry parameter represents the first sorted item that should appear in the output; 0 means start from the first one.
The toentry parameter represents the last sorted item that should appear in the output; 0 means include all of them.
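For example, a sketch that streams one-minute aggregated samples and then fetches a single fully aggregated sample (output is an Output object from the view):

    import datetime

    # Stream samples re-aggregated into one-minute buckets.
    for sample in output.get_iterdata(delta=datetime.timedelta(minutes=1)):
        print(sample)

    # Collapse the whole time range into a single sample.
    totals = output.get_data(aggregated=True)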
rvbd.shark.viewutils
This is a set of utility functions for working with views.
Utilities for writing view data
Print the data of a given view output to stdout.
widths is an optional list of integers specifying how many characters wide each column should be. If it is not specified, reasonable defaults are chosen.
If limit is specified, only the first limit rows are printed.
line_prefix is a string that is printed at the start of every line.
write_csv(filename, legend, stream, include_column_names=True, include_sample_times=True)
Saves the data of a view output to a comma-separated values (CSV) file.
legend is an output legend, typically the result of output.get_legend().
stream is a series of data samples, typically the result of output.get_data() or output.get_iterdata().
If include_column_names is True, the first line in the file will be a summary row indicating the fields that the file contains.
If include_sample_times is True, the first column will be a timestamp.
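For example (a sketch; output is an Output object from a view):

    from rvbd.shark import viewutils

    viewutils.write_csv('view_data.csv',
                        output.get_legend(),
                        output.get_iterdata())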
Mixing multiple view outputs
class rvbd.shark.viewutils.OutputMixer
Helper class that blends multiple data streams (i.e., from View.get_data) into a single combined stream. For example, given a "Bandwidth Over Time" view with separate outputs for bytes and packets, this class can be used to create a single output stream with bytes and packets columns.
Mixing is only supported on simple time-based views that do not include any keys (e.g., bandwidth over time, etc.)
See examples/shark/readview.py for typical usage.
OutputMixer()
Add a new source to the mixer.
src is a time-based view object.
Return the legend for each of the source objects
Return a generator for the combined stream of outputs from each source object
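A sketch of mixing the outputs of a time-based view into a single stream follows; the mixer method names add_source(), get_legend(), and get_iterdata() are assumptions, so check examples/shark/readview.py for the exact usage:

    from rvbd.shark.viewutils import OutputMixer

    mixer = OutputMixer()
    for output in view.all_outputs():
        mixer.add_source(output)     # assumed method name

    print(mixer.get_legend())        # assumed accessors
    for sample in mixer.get_iterdata():
        print(sample)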