Streaming Service Overview
The Oracle Cloud Infrastructure Streaming service provides a fully managed, scalable, and durable storage solution for ingesting continuous, high-volume streams of data that you can consume and process in real time. Streaming can be used for messaging, ingesting high-volume data such as application logs, operational telemetry, web click-stream data, or other use cases in which data is produced and processed continually and sequentially in a publish-subscribe messaging model.
Streaming Usage Scenarios
Here are some of the many possible uses for Streaming:
- Metric and log ingestion: Use Streaming as an alternative to traditional file-scraping approaches to help make critical operational data more quickly available for indexing, analysis, and visualization.
- Messaging: Use Streaming to decouple components of large systems. Streaming provides a pull/buffer-based communication model with sufficient capacity to flatten load spikes and the ability to feed multiple consumers with the same data independently. Key-scoped ordering and guaranteed durability provide reliable primitives to implement various messaging patterns, while high throughput potential allows for such a system to scale well.
- Web/Mobile activity data ingestion: Use Streaming for capturing activity from websites or mobile apps (such as page views, searches, or other actions users may take). This information can be used for real-time monitoring and analytics, as well as in data warehousing systems for offline processing and reporting.
- Infrastructure and apps event processing: Use Streaming as a unified entry point for cloud components to report their life cycle events for audit, accounting, and related activities.
The following concepts are essential to working with Streaming.
- Stream: A partitioned, append-only log of messages.
- Partition: A section of a stream. Partitions allow you to distribute a stream by splitting messages across multiple nodes. Each partition can be placed on a separate machine to allow multiple consumers to read from a stream in parallel.
- Message: A Base64-encoded record that is published to a stream.
- Producer: An entity that publishes messages to a stream.
- Consumer: An entity that reads messages from one or more streams.
- Consumer group: A set of instances that coordinates messages from all of the partitions in a stream. Instances in a consumer group maintain group membership through interaction; lack of interaction for a period of time results in a timeout, removing the instance from the group.
- Key: An identifier used to group related messages.
- Offset: The location of a message within a partition. You can use the offset to restart reading from a stream.
- Cursor: A pointer to a location in a stream. This location could be a pointer to a specific offset or time in a partition, or to a group's current location.
You can use IAM to set permissions on the following operations: list, get, update, create, and delete streams.
How Streaming Works
The Streaming service provides a robust, scalable mechanism that you can use to produce and consume high volumes of data between application components.
Here's how Streaming works: a producer publishes messages to a stream, which is an append-only log. These messages are distributed among the partitions using the message's key.
Streams are divided into a number of partitions for scalability. Partitions allow you to distribute a stream by splitting messages across multiple nodes (or brokers). Each partition can be placed on a separate machine to allow multiple consumers to read a stream in parallel. Multiple consumers can read from any partition regardless of where the partition is hosted.
A consumer can read messages from one or more streams. Each message within a stream is marked with an offset value, so a consumer can pick up where it left off if it is interrupted.
You can use the Streaming service by:
- Creating a stream using the Console or API.
- Using a producer to publish data to the stream.
- Building consumers to read and process messages from a stream using the GetMessages API.
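The publish-and-consume flow described in this section can be illustrated with a small in-memory model. This is only a sketch and does not call the Streaming service: the `ToyStream` class and its methods are hypothetical names, and the key-to-partition rule (a SHA-256 hash of the key modulo the partition count) is an assumption, since this overview does not say which hash the service uses. It shows messages with the same key landing in the same partition, each message receiving an offset, and a consumer resuming from a remembered offset.

```python
import base64
import hashlib


class ToyStream:
    """In-memory stand-in for a stream: a fixed set of append-only partitions."""

    def __init__(self, partition_count):
        self.partitions = [[] for _ in range(partition_count)]

    def partition_for(self, key):
        # Assumption: stable hash of the key modulo the partition count.
        # The real service's partitioning function is not documented here.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.partitions)

    def publish(self, key, value):
        """Append a Base64-encoded message; return its (partition, offset)."""
        partition = self.partition_for(key)
        log = self.partitions[partition]
        log.append({
            "key": base64.b64encode(key.encode("utf-8")).decode("ascii"),
            "value": base64.b64encode(value.encode("utf-8")).decode("ascii"),
        })
        return partition, len(log) - 1

    def read(self, partition, offset, limit=10):
        """Return up to `limit` (offset, decoded value) pairs from `offset` on."""
        batch = self.partitions[partition][offset:offset + limit]
        return [
            (offset + i, base64.b64decode(message["value"]).decode("utf-8"))
            for i, message in enumerate(batch)
        ]


# Producer: five messages that share one key go to one partition, in order.
stream = ToyStream(partition_count=3)
placements = [stream.publish("device-42", f"reading-{n}") for n in range(5)]
partition = placements[0][0]

# Consumer: read two messages, remember the next offset, then resume from it.
first_batch = stream.read(partition, offset=0, limit=2)
next_offset = first_batch[-1][0] + 1
remaining = stream.read(partition, offset=next_offset)
```

Because the consumer keeps only the next offset to read, an interrupted consumer in this model can restart from that offset without rereading earlier messages.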
Limits on Streaming Resources
The Streaming service has the following limitations:
- Messages are retained for a maximum of 7 days
- Throughput is limited to 1 MB per second per partition
- Each partition can handle a maximum message size of 1 MB
- Each partition can handle 5 GetMessages API calls per second
- Each partition can support a maximum total data write rate of 1 MB per second
- Each enterprise tenancy has a limit of 5 partitions (non-enterprise tenancies have a limit of 0, but you can request more)
See Service Limits for a list of applicable limits and instructions for requesting a limit increase.
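As a rough illustration of how the per-partition limits above translate into sizing, the following sketch estimates a partition count from an expected write rate. The function names are hypothetical, and the calculation assumes that load is spread evenly across partitions and that the GetMessages call limit is shared by all consumers polling the same partition; neither assumption is stated in this overview.

```python
import math

# Limits quoted in the list above.
MAX_WRITE_MB_PER_SEC = 1.0      # per partition
MAX_GET_CALLS_PER_SEC = 5       # per partition
DEFAULT_PARTITION_LIMIT = 5     # default for an enterprise tenancy


def partitions_needed(write_mb_per_sec):
    """Smallest partition count whose combined write rate covers the load."""
    if write_mb_per_sec <= 0:
        raise ValueError("write rate must be positive")
    return math.ceil(write_mb_per_sec / MAX_WRITE_MB_PER_SEC)


def min_poll_interval_seconds(consumers_per_partition=1):
    """Shortest even polling interval that stays within the GetMessages limit.

    Assumes every consumer on the partition polls at the same fixed interval.
    """
    if consumers_per_partition < 1:
        raise ValueError("at least one consumer is required")
    return consumers_per_partition / MAX_GET_CALLS_PER_SEC


def fits_default_limit(write_mb_per_sec):
    """True if the estimated partition count is within the default limit."""
    return partitions_needed(write_mb_per_sec) <= DEFAULT_PARTITION_LIMIT
```

For example, under these assumptions a steady 3.5 MB per second of writes needs four partitions, which fits the default enterprise limit, while 6.2 MB per second needs seven and would require a limit increase.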
Most types of Oracle Cloud Infrastructure resources have a unique, Oracle-assigned identifier called an Oracle Cloud ID (OCID). For information about the OCID format and other ways to identify your resources, see Resource Identifiers.
Ways to Access Oracle Cloud Infrastructure
You can access Oracle Cloud Infrastructure using the Console (a browser-based interface) or the REST API. Instructions for the Console and API are included in topics throughout this guide. For a list of available SDKs, see Software Development Kits and Command Line Interface.
To access the Console, you must use a supported browser. You can use the Console link at the top of this page to go to the sign-in page. You will be prompted to enter your cloud tenant, your user name, and your password.
For general information about using the API, see REST APIs.
To get started with Streaming, see the following topics:
- For instructions on how to manage streams, see Managing Streams.
- For information about publishing messages to a stream, see Publishing Messages.
- For information on how to consume messages, see Consuming Messages.
- For SDK information, see Oracle Cloud Infrastructure SDKs.
Authentication and Authorization
Each service in Oracle Cloud Infrastructure integrates with IAM for authentication and authorization, for all interfaces (the Console, SDK or CLI, and REST API).
An administrator in your organization needs to set up groups, compartments, and policies that control which users can access which services, which resources, and the type of access. These terms have the following meanings:
- Group: A collection of users who all need a particular type of access to a set of resources or compartment.
- Compartment: A collection of related resources that can be accessed only by certain groups that have been given permission by an administrator in your organization.
- Policy: An IAM document that specifies who has what type of access to your resources. The word is used in different ways: to mean an individual statement written in the policy language; to mean a collection of statements in a single, named "policy" document (which has an Oracle Cloud ID (OCID) assigned to it); and to mean the overall body of policies your organization uses to control access to resources.
For example, the policies control who can create new users, create and manage the cloud network, launch instances, create buckets, download objects, and so on. For more information, see Getting Started with Policies. For specific details about writing policies for each of the different services, see Policy Reference.
If you’re a regular user (not an administrator) who needs to use the Oracle Cloud Infrastructure resources that your company owns, contact your administrator to set up a user ID for you. The administrator can confirm which compartment or compartments you should be using.
For common policies used to authorize Streaming users, see Common Policies.
For in-depth information on granting users permissions for the Streaming service, see Details for the Streaming Service in the IAM policy reference.
You can apply tags to your resources to help you organize them according to your business needs. You can apply tags at the time you create a resource, or you can update the resource later with the desired tags. For general information about applying tags, see Resource Tags.
Last edited: 5/24/2019 8:34 AM