Network Working Group                                     H. Schulzrinne
Request for Comments: 4733                                   Columbia U.
Obsoletes: 2833                                                T. Taylor
Category: Standards Track                                         Nortel
                                                           December 2006

RTP Payload for DTMF Digits, Telephony Tones, and Telephony Signals


Status of This Memo


This document specifies an Internet standards track protocol for the Internet community, and requests discussion and suggestions for improvements. Please refer to the current edition of the "Internet Official Protocol Standards" (STD 1) for the standardization state and status of this protocol. Distribution of this memo is unlimited.

Copyright Notice


Copyright (C) The IETF Trust (2006).




Abstract


This memo describes how to carry dual-tone multifrequency (DTMF) signalling, other tone signals, and telephony events in RTP packets. It obsoletes RFC 2833.

This memo captures and expands upon the basic framework defined in RFC 2833, but retains only the most basic event codes. It sets up an IANA registry to which other event code assignments may be added. Companion documents add event codes to this registry relating to modem, fax, text telephony, and channel-associated signalling events. The remainder of the event codes defined in RFC 2833 are conditionally reserved in case other documents revive their use.

This document provides a number of clarifications to the original document. However, it specifically differs from RFC 2833 by removing the requirement that all compliant implementations support the DTMF events. Instead, compliant implementations taking part in out-of-band negotiations of media stream content indicate what events they support. This memo adds three new procedures to the RFC 2833 framework: subdivision of long events into segments, reporting of multiple events in a single packet, and the concept and reporting of state events.

Table of Contents


   1. Introduction
      1.1. Terminology
      1.2. Overview
      1.3. Potential Applications
      1.4. Events, States, Tone Patterns, and Voice-Encoded Tones
   2. RTP Payload Format for Named Telephone Events
      2.1. Introduction
      2.2. Use of RTP Header Fields
           2.2.1. Timestamp
           2.2.2. Marker Bit
      2.3. Payload Format
           2.3.1. Event Field
           2.3.2. E ("End") Bit
           2.3.3. R Bit
           2.3.4. Volume Field
           2.3.5. Duration Field
      2.4. Optional Media Type Parameters
           2.4.1. Relationship to SDP
      2.5. Procedures
           2.5.1. Sending Procedures
           2.5.2. Receiving Procedures
      2.6. Congestion and Performance
           2.6.1. Performance Requirements
           2.6.2. Reliability Mechanisms
           2.6.3. Adjusting to Congestion
   3. Specification of Event Codes for DTMF Events
      3.1. DTMF Applications
      3.2. DTMF Events
      3.3. Congestion Considerations
   4. RTP Payload Format for Telephony Tones
      4.1. Introduction
      4.2. Examples of Common Telephone Tone Signals
      4.3. Use of RTP Header Fields
           4.3.1. Timestamp
           4.3.2. Marker Bit
           4.3.3. Payload Format
           4.3.4. Optional Media Type Parameters
      4.4. Procedures
           4.4.1. Sending Procedures
           4.4.2. Receiving Procedures
           4.4.3. Handling of Congestion
   5. Examples
   6. Security Considerations
   7. IANA Considerations
      7.1. Media Type Registrations
           7.1.1. Registration of Media Type audio/telephone-event
           7.1.2. Registration of Media Type audio/tone
   8. Acknowledgements
   9. References
      9.1. Normative References
      9.2. Informative References
   Appendix A. Summary of Changes from RFC 2833

1. Introduction
1.1. Terminology

In this document, the key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" are to be interpreted as described in RFC 2119 [1].

This document uses the following abbreviations:


ANSam   Answer tone (amplitude modulated) [24]

DTMF    Dual-Tone Multifrequency [10]

IVR     Interactive Voice Response unit

PBX     Private branch exchange (telephone system)

PSTN    Public Switched (circuit) Telephone Network

RTP     Real-time Transport Protocol [5]

SDP     Session Description Protocol [9]


1.2. Overview

This memo defines two RTP [5] payload formats, one for carrying dual-tone multifrequency (DTMF) digits and other line and trunk signals as events (Section 2), and a second one to describe general multifrequency tones in terms only of their frequency and cadence (Section 4). Separate RTP payload formats for telephony tone signals are desirable since low-rate voice codecs cannot be guaranteed to reproduce these tone signals accurately enough for automatic recognition. In addition, tone properties such as the phase reversals in the ANSam tone will not survive speech coding. Defining separate payload formats also permits higher redundancy while maintaining a low bit rate. Finally, some telephony events such as "on-hook" occur out-of-band and cannot be transmitted as tones.

The remainder of this section provides the motivation for defining the payload types described in this document. Section 2 defines the payload format and associated procedures for use of named events. Section 3 describes the events for which event codes are defined in this document. Section 4 describes the payload format and associated procedures for tone representations. Section 5 provides some examples of encoded events, tones, and combined payloads. Section 6 deals with security considerations. Section 7 defines the IANA requirements for registration of event codes for named telephone events, establishes the initial content of that registry, and provides the media type registrations for the two payload formats. Appendix A describes the changes from RFC 2833 [12] and in particular indicates the disposition of the event codes defined in [12].

1.3. Potential Applications

The payload formats described here may be useful in a number of different scenarios.


On the sending side, there are two basic possibilities: either the sending side is an end system that originates the signals itself, or it is a gateway with the task of propagating incoming telephone signals into the Internet.


On the receiving side, there are more possibilities. The first is that the receiver must propagate tone signalling accurately into the PSTN for machine consumption. One example of this is a gateway passing DTMF tones to an IVR. In this scenario, frequencies, amplitudes, tone durations, and the durations of pauses between tones are all significant, and individual tone signals must be delivered reliably and in order.


In a second receiving scenario, the receiver must play out tones for human consumption. Typically, rather than a series of tone signals each with its own meaning, the content will consist of a single tone played out continuously or a single sequence of tones and possibly silence, repeated cyclically for some period of time. Often the end of the tone playout will be triggered by an event fed back in the other direction, using either in- or out-of-band means. Examples of this are dial tone or busy tone.


The relationship between position in the network and the tones to be played out is a complicating factor in this scenario. In the phone network, tones are generated at different places, depending on the switching technology and the nature of the tone. This determines, for example, whether a person making a call to a foreign country hears the local tones she is familiar with or the tones used in the country called.


For analog lines, dial tone is always generated by the local switch. Integrated Services Digital Network (ISDN) terminals may generate dial tone locally and then send a Q.931 [22] SETUP message containing the dialed digits. If the terminal just sends a SETUP message without any Called Party digits, then the switch does digit collection (provided by the terminal as KEYPAD key press digit information within Called Party or Keypad Facility Information Elements (IEs) of INFORMATION messages), and provides dial tone over the B-channel. The terminal can either use the audio signal on the B-channel or use the Q.931 messages to trigger locally generated dial tone.

Ringing tone (also called ringback tone) is generated by the local switch at the callee, with a one-way voice path opened up as soon as the callee's phone rings. (This reduces the chance of clipping the called party's response just after answer. It also permits pre-answer announcements or in-band call-progress indications to reach the caller before or in lieu of a ringing tone.) Congestion tone and special information tones can be generated by any of the switches along the way, and may be generated by the caller's switch based on ISDN User Part (ISUP) messages received. For analog instruments, busy tone is generated by the caller's switch, triggered by the appropriate ISUP message; for ISDN, it is generated by the ISDN terminal.

In the third scenario, an end system is directly connected to the Internet and processes the incoming media stream directly. There is no need to regenerate tone signals, so time alignment and power levels are not relevant. These systems rely on sending systems to generate events in place of tones and do not perform their own audio waveform analysis. An example of such a system is an Internet interactive voice response (IVR) system.


In circumstances where exact timing alignment between the audio stream and the DTMF digits or other events is not important and data is sent unicast, as in the IVR example, it may be preferable to use a reliable control protocol rather than RTP packets. In those circumstances, this payload format would not be used.


Note that in a number of these cases it is possible that the gateway or end system will be both a sender and receiver of telephone signals. Sometimes the same class of signals will be sent as received -- in the case of "RTP trunking" or voice-band data, for instance. In other cases, such as that of an end system serving analogue lines, the signals sent will be in a different class from those received.

1.4. Events, States, Tone Patterns, and Voice-Encoded Tones

This document provides the means for in-band transport over the Internet of two broad classes of signalling information: in-band tones or tone sequences, and signals sent out-of-band in the PSTN. Tone signals can be carried using any of the three methods listed below. Depending on the application, it may be desirable to carry the signalling information in more than one form at once.


1. The gateway or end system can change to a higher-bandwidth codec such as G.711 [19] when tone signals are to be conveyed. See new ITU-T Recommendation V.152 [26] for a formal treatment of this approach. Alternatively, for fax, text, or modem signals respectively, a specialized transport such as T.38 [23], RFC 4103 [15], or V.150.1 modem relay [25] may be used. Finally, 64 kbit/s channels may be carried transparently using the RFC 4040 Clearmode payload type [14]. These methods are out of scope of the present document, but may be used along with the payload types defined here.

2. The sending gateway can simply measure the frequency components of the voice-band signals and transmit this information to the RTP receiver using the tone representation defined in this document (Section 4). In this mode, the gateway makes no attempt to discern the meaning of the tones, but simply distinguishes tones from speech signals. An end system may use the same approach using configured rather than measured frequencies.


       All tone signals in use in the PSTN and meant for human
       consumption are sequences of simple combinations of sine waves,
       either added or modulated.  (However, some modem signals such as
       the ANSam tone [24] or systems dependent on phase shift keying
       cannot be conveyed so simply.)

3. As a third option, a sending gateway can recognize tones such as ringing or busy tone or DTMF digit '0', and transmit a code that identifies them using the telephone-event payload defined in this document (Section 2). The receiver then produces a tone signal or other indication appropriate to the signal. Generally, since the recognition of signals at the sender often depends on their on/off pattern or the sequence of several tones, this recognition can take several seconds. On the other hand, the gateway may have access to the actual signalling information that generates the tones and thus can generate the RTP packet immediately, without the detour through acoustic signals.


The third option (use of named events) is the only feasible method for transmitting out-of-band PSTN signals as content within RTP sessions.


2. RTP Payload Format for Named Telephone Events
2.1. Introduction

The RTP payload format for named telephone events is designated as "telephone-event", the media type as "audio/telephone-event". In accordance with current practice, this payload format does not have a static payload type number, but uses an RTP payload type number established dynamically and out-of-band. The default clock frequency is 8000 Hz, but the clock frequency can be redefined when assigning the dynamic payload type.

Named telephone events are carried as part of the audio stream and MUST use the same sequence number and timestamp base as the regular audio channel to simplify the generation of audio waveforms at a gateway. The named telephone-event payload type can be considered to be a very highly-compressed audio codec and is treated the same as other codecs.


2.2. Use of RTP Header Fields
2.2.1. Timestamp

The event duration described in Section 2.5 begins at the time given by the RTP timestamp. For events that span multiple RTP packets, the RTP timestamp identifies the beginning of the event, i.e., several RTP packets may carry the same timestamp. For long-lasting events that have to be split into segments (see "Long-Duration Events" under Section 2.5.1), the timestamp indicates the beginning of the segment.


2.2.2. Marker Bit

The RTP marker bit indicates the beginning of a new event. For long-lasting events that have to be split into segments (see "Long-Duration Events" under Section 2.5.1), only the first segment will have the marker bit set.


2.3. Payload Format

The payload format for named telephone events is shown in Figure 1.


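    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     event     |E|R| volume    |          duration             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+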

Figure 1: Payload Format for Named Events


2.3.1. Event Field

The event field is a number between 0 and 255 identifying a specific telephony event. An IANA registry of event codes for this field has been established (see IANA Considerations, Section 7). The initial content of this registry consists of the events defined in Section 3.


2.3.2. E ("End") Bit

If set to a value of one, the "end" bit indicates that this packet contains the end of the event. For long-lasting events that have to be split into segments (see "Long-Duration Events" under Section 2.5.1), only the final packet for the final segment will have the E bit set.


2.3.3. R Bit

This field is reserved for future use. The sender MUST set it to zero, and the receiver MUST ignore it.


2.3.4. Volume Field

For DTMF digits and other events representable as tones, this field describes the power level of the tone, expressed in dBm0 after dropping the sign. Power levels range from 0 to -63 dBm0. Thus, larger values denote lower volume. This value is defined only for events for which the documentation indicates that volume is applicable. For other events, the sender MUST set volume to zero and the receiver MUST ignore the value.


2.3.5. Duration Field

The duration field indicates the duration of the event or segment being reported, in timestamp units, expressed as an unsigned integer in network byte order. For a non-zero value, the event or segment began at the instant identified by the RTP timestamp and has so far lasted as long as indicated by this parameter. The event may or may not have ended. If the event duration exceeds the maximum representable by the duration field, the event is split into several contiguous segments as described below (see "Long-Duration Events" under Section 2.5.1).


The special duration value of zero is reserved to indicate that the event lasts "forever", i.e., is a state and is considered to be effective until updated. A sender MUST NOT transmit a zero duration for events other than those defined as states. The receiver SHOULD ignore an event report with zero duration if the event is not a state.


Events defined as states MAY contain a non-zero duration, indicating that the sender intends to refresh the state before the time duration has elapsed ("soft state").


For a sampling rate of 8000 Hz, the duration field is sufficient to express event durations of up to approximately 8 seconds.
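
As an illustration only, the four-octet payload of Figure 1 could be packed and parsed along the lines of the following Python sketch. The names TelephoneEvent, pack_event, and unpack_event are hypothetical and are not defined by this memo. The sketch assumes the field widths shown in Figure 1: an 8-bit event code, the E bit, the R bit, a 6-bit volume, and a 16-bit duration in network byte order.

```python
import struct
from typing import NamedTuple


class TelephoneEvent(NamedTuple):
    """Hypothetical container for the fields of Figure 1."""
    event: int      # event code, 0..255
    end: bool       # E bit
    volume: int     # 0..63, power level in dBm0 with the sign dropped
    duration: int   # 0..0xFFFF, in RTP timestamp units


def pack_event(ev: TelephoneEvent) -> bytes:
    """Encode one event report as four octets."""
    if not 0 <= ev.event <= 255:
        raise ValueError("event code must be in 0..255")
    if not 0 <= ev.volume <= 63:
        raise ValueError("volume must be in 0..63")
    if not 0 <= ev.duration <= 0xFFFF:
        raise ValueError("duration must fit in 16 bits")
    # Second octet: E bit (most significant), R bit (sent as zero), volume.
    flags = (0x80 if ev.end else 0x00) | ev.volume
    return struct.pack("!BBH", ev.event, flags, ev.duration)


def unpack_event(payload: bytes) -> TelephoneEvent:
    """Decode the first four octets of a payload; the R bit is ignored."""
    event, flags, duration = struct.unpack("!BBH", payload[:4])
    return TelephoneEvent(event, bool(flags & 0x80), flags & 0x3F, duration)
```

With the default 8000 Hz clock, the largest duration value (0xFFFF) corresponds to 65535/8000, or about 8.19 seconds, which is the limit mentioned above.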


2.4. Optional Media Type Parameters

As indicated in the media type registration for named events in Section 7.1.1, the telephone-event media type supports two optional parameters: the "events" parameter and the "rate" parameter.


The "events" parameter lists the events supported by the implementation. Events are listed as one or more comma-separated elements. Each element can be either a single integer providing the value of an event code or an integer followed by a hyphen and a larger integer, presenting a range of consecutive event code values. The list does not have to be sorted. No white space is allowed in the argument. The union of all of the individual event codes and event code ranges designates the complete set of event numbers supported by the implementation.


The "rate" parameter describes the sampling rate, in Hertz, and hence the units for the RTP timestamp and event duration fields. The number is written as an integer. If omitted, the default value is 8000 Hz.

2.4.1. Relationship to SDP

The recommended mapping of media type optional parameters to SDP is given in Section 3 of RFC 3555 [6]. The "rate" media type parameter for the named event payload type follows this convention: it is expressed as usual as the <clock rate> component of the a=rtpmap: attribute line.

The "events" media type parameter deviates from the convention suggested in RFC 3555 because it omits the string "events=" before the list of supported events.

a=fmtp:<format> <list of values>


The list of values has the format and meaning described above.
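
A receiver of such a list might expand it into a set of event codes roughly as follows. This Python sketch is illustrative only: the function name parse_events is hypothetical, and the choice to reject malformed input with ValueError is an assumption, since this memo does not say how a malformed list is to be handled.

```python
from typing import Set


def _code(text: str) -> int:
    """Convert one decimal event code, rejecting white space and signs."""
    if not (text.isascii() and text.isdigit()):
        raise ValueError("not a decimal event code: %r" % text)
    value = int(text)
    if value > 255:
        raise ValueError("event code out of range: %d" % value)
    return value


def parse_events(values: str) -> Set[int]:
    """Expand a list such as '0-15,66,70' into the set of event codes."""
    codes: Set[int] = set()
    for element in values.split(","):
        if "-" in element:
            low_text, high_text = element.split("-", 1)
            low, high = _code(low_text), _code(high_text)
            if high <= low:
                # A range is an integer followed by a larger integer.
                raise ValueError("bad range: %r" % element)
            codes.update(range(low, high + 1))
        else:
            codes.add(_code(element))
    return codes
```

Because the result is a set, the elements may appear in any order and overlapping ranges are merged, which matches the "union" semantics described above.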


For example, if the payload format uses the payload type number 100, and the implementation can handle the DTMF tones (events 0 through 15) and the dial and ringing tones (assuming as an example that these were defined as events with codes 66 and 70, respectively), it would include the following description in its SDP message:


m=audio 12346 RTP/AVP 100
a=rtpmap:100 telephone-event/8000
a=fmtp:100 0-15,66,70

The following sample media type definition corresponds to the SDP example above:



audio/telephone-event;events="0-15,66,70";rate="8000"
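
The correspondence between the two forms can be sketched in Python as shown below. The function names are hypothetical, and the sketch only restates the mapping described in this section: "rate" becomes the clock rate on the a=rtpmap: line, and the "events" list appears on the a=fmtp: line without the string "events=".

```python
from typing import List


def telephone_event_sdp(port: int, payload_type: int, events: str,
                        rate: int = 8000) -> List[str]:
    """Build the SDP lines for one telephone-event stream."""
    return [
        "m=audio %d RTP/AVP %d" % (port, payload_type),
        # "rate" maps to the <clock rate> component of a=rtpmap.
        "a=rtpmap:%d telephone-event/%d" % (payload_type, rate),
        # The list of events is given bare, without "events=".
        "a=fmtp:%d %s" % (payload_type, events),
    ]


def telephone_event_media_type(events: str, rate: int = 8000) -> str:
    """Build the equivalent media type string with both parameters."""
    return 'audio/telephone-event;events="%s";rate="%d"' % (events, rate)
```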

2.5. Procedures

This section defines the procedures associated with the named event payload type. Additional procedures may be specified in the documentation associated with specific event codes.


2.5.1. Sending Procedures
Negotiation of Payloads

Events are usually sent in combination with or alternating with other payload types. Payload negotiation may specify separate event and other payload streams, or it may specify a combined stream that mixes other payload types with events using RFC 2198 [2] redundancy headers. The purpose of using a combined stream may be for debugging or to ease the transition between general audio and events.

Negotiation of payloads between sender and receiver is achieved by out-of-band means, using SDP, for example.


The sender SHOULD indicate what events it supports, using the optional "events" parameter associated with the telephone-event media type. If the sender receives an "events" parameter from the receiver, it MUST restrict the set of events it sends to those listed in the received "events" parameter. For backward compatibility, if no "events" parameter is received, the sender SHOULD assume support for the DTMF events 0-15 but for no other events.
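
The rule in the preceding paragraph can be sketched as follows. This Python fragment is illustrative only; the name sendable_events is hypothetical, and the sketch assumes the received "events" parameter has already been expanded into a set of integers (None meaning that no parameter was received).

```python
from typing import FrozenSet, Iterable, Optional

# DTMF events 0-15, assumed supported when the peer sends no "events" list.
DTMF_EVENTS: FrozenSet[int] = frozenset(range(16))


def sendable_events(supported: Iterable[int],
                    received: Optional[Iterable[int]] = None) -> FrozenSet[int]:
    """Return the events this sender may transmit to the peer."""
    if received is None:
        # Backward compatibility: assume DTMF support and nothing else.
        peer = DTMF_EVENTS
    else:
        peer = frozenset(received)
    # Never send an event the peer did not list.
    return frozenset(supported) & peer
```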


Events MAY be sent in combination with older events using RFC 2198 [2] redundancy. Section 2.5.1.4 describes how this can be used to avoid packet and RTP header overheads when retransmitting final event reports. Section 2.6 discusses the use of additional levels of RFC 2198 redundancy to increase the probability that at least one copy of the report of the end of an event reaches the receiver. The following SDP shows an example of such usage, where G.711 audio appears in a separate stream, and the primary component of the redundant payload is events.


m=audio 12344 RTP/AVP 99
a=rtpmap:99 pcmu/8000
m=audio 12346 RTP/AVP 100 101
a=rtpmap:100 red/8000/1
a=fmtp:100 101/101/101
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-15

When used in accordance with the offer-answer model (RFC 3264 [4]), the SDP a=ptime: attribute indicates the packetization period that the author of the session description expects when receiving media. This value does not have to be the same in both directions. The appropriate period may vary with the application, since increased packetization periods imply increased end-to-end response times in instances where one end responds to events reported from the other.

Negotiation of telephone-events sessions using SDP MAY specify such differences by separating events corresponding to different applications into different streams. In the example below, events 0-15 are DTMF events, which have a fairly wide tolerance on timing. Events 32-49 and 52-60 are events related to data transmission and are subject to end-to-end response time considerations. As a result, they are assigned a smaller packetization period than the DTMF events.


m=audio 12344 RTP/AVP 99
a=rtpmap:99 telephone-event/8000
a=fmtp:99 0-15
a=ptime:50
m=audio 12346 RTP/AVP 100
a=rtpmap:100 telephone-event/8000
a=fmtp:100 32-49,52-60
a=ptime:30

For further discussion of packetization periods see Section 2.6.3.

Transmission of Event Packets

DTMF digits and other named telephone events are carried as part of the audio stream, and they MUST use the same sequence number and timestamp base as the regular audio channel to simplify the generation of audio waveforms at a gateway.


An audio source SHOULD start transmitting event packets as soon as it recognizes an event and continue to send updates until the event has ended. The update packets MUST have the same RTP timestamp value as the initial packet for the event, but the duration MUST be increased to reflect the total cumulative duration since the beginning of the event.


The first packet for an event MUST have the M bit set. The final packet for an event MUST have the E bit set, but setting of the E bit MAY be deferred until the final packet is retransmitted (see Section 2.5.1.4). Intermediate packets for an event MUST NOT have either the M bit or the E bit set.


Sending of a packet with the E bit set is OPTIONAL if the packet reports two events that are defined as mutually exclusive states, or if the final packet for one state is immediately followed by a packet reporting a mutually exclusive state. (For events defined as states, the appearance of a mutually exclusive state implies the end of the previous state.)

A source has wide latitude as to how often it sends event updates. A natural interval is the spacing between non-event audio packets. (Recall that a single RTP packet can contain multiple audio frames for frame-based codecs and that the packet interval can vary during a session.) Alternatively, a source MAY decide to use a different spacing for event updates, with a value of 50 ms RECOMMENDED.
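
The sequence of reports for a single short event might look like the output of the following Python sketch. It is illustrative only: the name event_updates is hypothetical, and the sketch assumes that the total duration is known in advance and fits in the duration field, whereas a real sender reports while the event is still in progress. The default interval of 400 timestamp units is 50 ms at 8000 Hz.

```python
from typing import List, Tuple

# (RTP timestamp, M bit, E bit, duration) for one event report.
Report = Tuple[int, bool, bool, int]


def event_updates(start_ts: int, total_duration: int,
                  interval: int = 400) -> List[Report]:
    """List the reports sent for one event, one per update interval."""
    if not 0 < total_duration <= 0xFFFF:
        raise ValueError("duration must be in 1..0xFFFF for a single segment")
    if interval <= 0:
        raise ValueError("interval must be positive")
    reports: List[Report] = []
    reported = 0
    while reported < total_duration:
        # Duration is cumulative since the start of the event.
        reported = min(reported + interval, total_duration)
        first = not reports                  # M bit only on the first packet
        last = reported == total_duration    # E bit only on the final packet
        # Every report carries the timestamp of the start of the event.
        reports.append((start_ts, first, last, reported))
    return reports
```

For a 120 ms event (960 timestamp units) this yields three reports with durations 400, 800, and 960, all carrying the same RTP timestamp.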

Timing information is contained in the RTP timestamp, allowing precise recovery of inter-event times. Thus, the sender does not in theory need to maintain precise or consistent time intervals between event packets. However, the sender SHOULD minimize the need for buffering at the receiving end by sending event reports at constant intervals.


DTMF digits and other tone events are sent incrementally to avoid having the receiver wait for the completion of the event. In some cases (for example, data session startup protocols), waiting until the end of a tone before reporting it will cause the session to fail. In other cases, it will simply cause undesirable delays in playout at the receiving end.


For robustness, the sender SHOULD retransmit "state" events periodically.

Long-Duration Events

If an event persists beyond the maximum duration expressible in the duration field (0xFFFF), the sender MUST send a packet reporting this maximum duration but MUST NOT set the E bit in this packet. The sender MUST then begin reporting a new "segment" with the RTP timestamp set to the time at which the previous segment ended and the duration set to the cumulative duration of the new segment. The M bit of the first packet reporting the new segment MUST NOT be set. The sender MUST repeat this procedure as required until the end of the complete event has been reached. The final packet for the complete event MUST have the E bit set (either on initial transmission or on retransmission as described below).
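As a rough illustration of the segmentation procedure, the following sketch computes, for each segment, the RTP timestamp and the duration reported in its final packet. The function name and the tuple layout are assumptions made for this example only.

```python
MAX_DURATION = 0xFFFF  # largest value expressible in the duration field


def split_into_segments(start_ts, total_duration):
    """Return (timestamp, final_duration, is_last_segment) per segment.

    Only the last segment's final packet carries the E bit; only the very
    first packet of the first segment carries the M bit.
    """
    if total_duration <= 0:
        raise ValueError("total_duration must be positive")
    segments = []
    ts = start_ts
    remaining = total_duration
    while remaining > MAX_DURATION:
        segments.append((ts, MAX_DURATION, False))
        # The next segment starts where the previous one ended
        # (RTP timestamps wrap at 32 bits).
        ts = (ts + MAX_DURATION) % 2**32
        remaining -= MAX_DURATION
    segments.append((ts, remaining, True))
    return segments
```

For example, an event lasting 150000 timestamp units is reported as two full segments of 0xFFFF units followed by a final segment of 18930 units.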

2.5.1.3.1. Exceptional Procedure for Combined Payloads

If events are combined as a redundant payload with another payload type using RFC 2198 [2] redundancy, the above procedure SHALL be applied, but using a maximum duration that ensures that the timestamp offset of the oldest generation of events in an RFC 2198 packet never exceeds 0x3FFF. If the sender is using a constant packetization period, the maximum segment duration can be calculated from the following formula:


maximum duration = 0x3FFF - (R-1)*(packetization period in timestamp units)


where R is the highest redundant layer number consisting of event payload.
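The formula can be evaluated directly. The sketch below is illustrative only; the function name is hypothetical, and the example values assume an 8000 Hz timestamp clock with a 50 ms (400-unit) packetization period.

```python
def max_segment_duration(r, packetization_period):
    """Maximum segment duration, in timestamp units, under RFC 2198 redundancy.

    r is the highest redundant layer number carrying event payload;
    packetization_period is expressed in timestamp units.
    """
    if r < 1:
        raise ValueError("r must be at least 1")
    return 0x3FFF - (r - 1) * packetization_period
```

With these assumed values, one redundant layer allows segments of up to 16383 units, and three layers allow 15583 units.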


The RFC 2198 redundancy header timestamp offset value is only 14 bits, compared with the 16 bits in the event payload duration field. Since with other payloads the RTP timestamp typically increments for each new sample, the timestamp offset value becomes limiting on reported event duration. The limit becomes more constraining when older generations of events are also included in the combined payload.

2.5.1.4. Retransmission of Final Packet

The final packet for each event and for each segment SHOULD be sent a total of three times at the interval used by the source for updates. This ensures that the duration of the event or segment can be recognized correctly even if an instance of the last packet is lost.


A sender MAY use RFC 2198 [2] with up to two levels of redundancy to combine retransmissions with reports of new events, thus saving on header overheads. In this usage, the primary payload is new event reports, while the first and (if necessary) second levels of redundancy report first and second retransmissions of final event reports. Within a session negotiated to allow such usage, packets containing the RFC 2198 payload SHOULD NOT be sent except when both primary and retransmitted reports are to be included. All other packets of the session SHOULD contain only the simple, non-redundant telephone-event payload. Note that the expected proportion of simple versus redundant packets affects the order in which they should be specified on an SDP m= line.


There is little point in sending initial or interim event reports redundantly because each succeeding packet describes the event fully (except for typically irrelevant variations in volume).


A sender MAY delay setting the E bit until retransmitting the last packet for a tone, rather than setting the bit on its first transmission. This avoids having to wait to detect whether the tone has indeed ended. Once the sender has set the E bit for a packet, it MUST continue to set the E bit for any further retransmissions of that packet.

2.5.1.5. Packing Multiple Events into One Packet

Multiple named events can be packed into a single RTP packet if and only if the events are consecutive and contiguous, i.e., occur without overlap and without pause between them, and if the last event packed into a packet occurs quickly enough to avoid excessive delays at the receiver.


This approach is similar to having multiple frames of frame-based audio in one RTP packet.


The constraint that packed events not overlap implies that events designated as states can be followed in a packet only by other state events that are mutually exclusive to them. The constraint itself is needed so that the beginning time of each event can be calculated at the receiver.


In a packet containing events packed in this way, the RTP timestamp MUST identify the beginning of the first event or segment in the packet. The M bit MUST be set if the packet records the beginning of at least one event. (This will be true except when the packet carries the end of one segment and the beginning of the next segment of the same long-lasting event.) The E bit and duration for each event in the packet MUST be set using the same rules as if that event were the only event contained in the packet.

2.5.1.6. RTP Sequence Number

The RTP sequence number MUST be incremented by one in each successive RTP packet sent. Incrementing applies to retransmitted as well as initial instances of event reports, to permit the receiver to detect lost packets for RTP Control Protocol (RTCP) receiver reports.
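The interaction between retransmission of the final packet, the optional deferral of the E bit, and sequence numbering might be sketched as follows. The function name and tuple layout are hypothetical; the sketch assumes the usual 16-bit RTP sequence number space.

```python
def transmit_final(seq, timestamp, duration, copies=3, defer_e_bit=False):
    """Return (seq, timestamp, duration, e_bit) for each copy of a final packet.

    Every copy gets a fresh sequence number; timestamp and duration repeat.
    If defer_e_bit is True, the E bit is left clear on the first transmission
    and set on every retransmission.
    """
    sent = []
    for i in range(copies):
        e_bit = not (defer_e_bit and i == 0)
        sent.append((seq, timestamp, duration, e_bit))
        seq = (seq + 1) % 65536  # sequence numbers wrap at 16 bits
    return sent
```

Because each copy consumes a sequence number, a receiver can distinguish a lost copy from a packet that was never sent when it compiles RTCP receiver reports.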


2.5.2. Receiving Procedures
2.5.2.1. Indication of Receiver Capabilities Using SDP

Receivers can indicate which named events they can handle, for example, by using the Session Description Protocol (RFC 4566 [9]). SDP descriptions using the event payload MUST contain an fmtp format attribute that lists the event values that the receiver can process.
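As an illustration, a receiver that handles the DTMF events plus two further events might advertise "a=fmtp:100 0-15,66,70", where the payload type 100 is an arbitrary dynamic value chosen for this example. The sketch below expands such an event list into a set of event codes; the parsing helper is hypothetical and assumes a comma-separated list of single values and inclusive ranges.

```python
def parse_event_list(fmtp_value):
    """Expand an event list such as "0-15,66,70" into a set of event codes."""
    events = set()
    for item in fmtp_value.split(","):
        item = item.strip()
        if "-" in item:
            low, high = item.split("-", 1)
            events.update(range(int(low), int(high) + 1))  # inclusive range
        else:
            events.add(int(item))
    return events
```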

2.5.2.2. Playout of Tone Events

In the gateway scenario, an Internet telephony gateway connecting a packet voice network to the PSTN re-creates the DTMF or other tones and injects them into the PSTN. Since, for example, DTMF digit recognition takes several tens of milliseconds, the first few milliseconds of a digit will arrive as regular audio packets. Thus, careful time and power (volume) alignment between the audio samples and the events is needed to avoid generating spurious digits at the receiver. The receiver may also choose to delay playout of the tones by some small interval after playout of the preceding audio has ended, to ensure that downstream equipment can discriminate the tones properly.


Some implementations send events and encoded audio packets (e.g., PCMU or the codec used for speech signals) for the same time instant for the duration of the event. It is RECOMMENDED that gateways render only the telephone-event payload once it is received, since the audio may contain spurious tones introduced by the audio compression algorithm. However, it is anticipated that these extra tones in general should not interfere with recognition at the far end.


Receiver implementations MAY use different algorithms to create tones, including the two described here. (Note that not all implementations have the need to re-create a tone; some may only care about recognizing the events.) With either algorithm, a receiver may impose a playout delay to provide robustness against packet loss or delay. The tradeoff between playout delay and other factors is discussed further in Section 2.6.3.


In the first algorithm, the receiver simply places a tone of the given duration in the audio playout buffer at the location indicated by the timestamp. As additional packets are received that extend the same tone, the waveform in the playout buffer is extended accordingly. (Care has to be taken if audio is mixed, i.e., summed, in the playout buffer rather than simply copied.) Thus, if a packet in a tone lasting longer than the packet interarrival time gets lost and the playout delay is short, a gap in the tone may occur.


Alternatively, the receiver can start a tone and play it until one of the following occurs:


o it receives a packet with the E bit set;


o it receives the next tone, distinguished by a different timestamp value (noting that new segments of long-duration events also appear with a new timestamp value);


o it receives an alternative non-event media stream (assuming none was being received while the event stream was active); or


o a given time period elapses.


This is more robust against packet loss, but may extend the tone beyond its original duration if all retransmissions of the last packet in an event are lost. Limiting the time period of extending the tone is necessary to prevent a tone from getting "stuck". This algorithm is not a license for senders to set the duration field to zero; it MUST be set to the current duration as described, since this is needed to create accurate events if the first event packet is lost, among other reasons.


Regardless of the algorithm used, the tone SHOULD NOT be extended by more than three packet interarrival times. A slight extension of tone durations and shortening of pauses is generally harmless.
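The stopping conditions of the second algorithm, together with the three-interarrival limit, could be checked along the following lines. This is a simplified sketch: the function name and arguments are assumptions, times are in seconds, and a real receiver would additionally play the tone out to the duration reported in the packet carrying the E bit.

```python
from typing import Optional, Tuple


def should_stop_extending(playing_ts: int,
                          packet: Optional[Tuple[int, bool]],
                          other_media_received: bool,
                          time_since_last_update: float,
                          interarrival: float) -> bool:
    """Decide whether the receiver should stop extending the current tone.

    packet is (rtp_timestamp, e_bit) for a newly received event packet,
    or None if nothing arrived since the last check.
    """
    if packet is not None:
        timestamp, e_bit = packet
        if e_bit:
            return True                 # sender reported the end of the event
        if timestamp != playing_ts:
            return True                 # next tone, or next segment, started
    if other_media_received:
        return True                     # non-event media has taken over
    # Do not extend by more than three packet interarrival times.
    return time_since_last_update > 3 * interarrival
```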


A receiver SHOULD NOT restart a tone once playout has stopped. It MAY do so if the tone is of a type meant for human consumption or is one for which interruptions will not cause confusion at the receiving device.


If a receiver receives an event packet for an event that it is not currently playing out and the packet does not have the M bit set, earlier packets for that event have evidently been lost. This can be confirmed by gaps in the RTP sequence number. The receiver MAY determine on the basis of retained history and the timestamp and event code of the current packet that it corresponds to an event already played out and lapsed. In that case, further reports for the event MUST be ignored, as indicated in the previous paragraph.


If, on the other hand, the event has not been played out at all, the receiver MAY attempt to play the event out to the complete duration indicated in the event report. The appropriate behavior will depend on the event type, and requires consideration of the relationship of the event to audio media flows and whether correct event duration is essential to the correct operation of the media session.


A receiver SHOULD NOT rely on a particular event packet spacing, but instead MUST use the event timestamps and durations to determine timing and duration of playout.


The receiver MUST calculate jitter for RTCP receiver reports based on all packets with a given timestamp. Note: The jitter value should primarily be used as a means for comparing the reception quality between two users or two time periods, not as an absolute measure.


If a zero volume is indicated for an event for which the volume field is defined, then the receiver MAY reconstruct the volume from the volume of non-event audio or MAY use the nominal value specified by the ITU Recommendation or other document defining the tone. This ensures backwards compatibility with RFC 2833 [12], where the volume field was defined only for DTMF events.

2.5.2.3. Long-Duration Events

If an event report is received with duration equal to the maximum duration expressible in the duration field (0xFFFF) and the E bit for the report is not set, the event report may mark the end of a segment generated according to the procedures of Section 2.5.1.3. If another report for the same event type is received, the receiver MUST compare the RTP timestamp for the new event with the sum of the RTP timestamp of the previous report plus the duration (0xFFFF). The receiver uses the absence of a gap between the events to detect that it is receiving a single long-duration event.


The total duration of a long-duration event is (obviously) the sum of the durations of the segments used to report it. This is equal to the duration of the final segment (as indicated in the final packet for that segment), plus 0xFFFF multiplied by the number of segments preceding the final segment.
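A receiver-side sketch of the adjacency test and the duration sum is given below. The function names and argument order are hypothetical, and the sketch assumes segments of the full 0xFFFF length (that is, no RFC 2198 redundancy).

```python
MAX_DURATION = 0xFFFF


def continues_previous_segment(prev_event, prev_ts, prev_duration, prev_e_bit,
                               new_event, new_ts):
    """True if the new report is the next segment of the same long event."""
    return (new_event == prev_event
            and prev_duration == MAX_DURATION
            and not prev_e_bit
            # No gap: the new segment starts exactly where the old one ended.
            and new_ts == (prev_ts + MAX_DURATION) % 2**32)


def total_event_duration(final_segment_duration, preceding_segments):
    """Total duration of the event, in timestamp units."""
    return final_segment_duration + MAX_DURATION * preceding_segments
```

For instance, a final segment of 18930 units preceded by two full segments gives a total of 150000 units.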

2.5.2.3.1. Exceptional Procedure for Combined Payloads

If events are combined as a redundant payload with another payload type using RFC 2198 [2] redundancy, segments are generated at intervals of 0x3FFF or less, rather than 0xFFFF, as required by the procedures of Section 2.5.1.3.1 in this case. If a receiver is using the events component of the payload, event duration may be only an approximate indicator of division into segments, but the lack of an E bit and the adjacency of two reports with the same event code are strong indicators in themselves.

2.5.2.4. Multiple Events in a Packet

The procedures of Section 2.5.1.5 require that if multiple events are reported in the same packet, they are contiguous and non-overlapping. As a result, it is not strictly necessary for the receiver to know the start times of the events following the first one in order to play them out -- it needs only to respect the duration reported for each event. Nevertheless, if knowledge of the start time for a given event after the first one is required, it is equal to the sum of the start time of the preceding event plus the duration of the preceding event.
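The start-time rule amounts to a running sum over the reported durations. A minimal sketch, with a hypothetical function name and durations given in timestamp units:

```python
def packed_event_start_times(rtp_timestamp, durations):
    """Return the start time of each event packed into one packet.

    The first event starts at the packet's RTP timestamp; each later event
    starts where its predecessor ended.
    """
    starts = []
    t = rtp_timestamp
    for d in durations:
        starts.append(t)
        t = (t + d) % 2**32  # RTP timestamps wrap at 32 bits
    return starts
```

A packet with timestamp 8000 carrying events of 560, 400, and 640 units thus yields start times 8000, 8560, and 8960.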

2.5.2.5. Soft States

If the duration of a soft state event expires, the receiver SHOULD consider the value of the state to be "unknown" unless otherwise indicated in the event documentation.


2.6. Congestion and Performance

Packet transmission through the Internet is marked by occasional periods of congestion lasting on the order of a second, during which network delay, jitter, and packet loss are all much higher than they are in between these periods. Reference [28] characterizes this phenomenon. Well-behaved applications are expected, preferably, to reduce their demands on the network during such periods of congestion. At the least, they should not increase their demands. This section explores both application performance and the possibilities for good behavior in the face of congestion.


2.6.1. Performance Requirements

Typically, an implementation of the telephone-event payload will aim to limit the rate at which each of the following impairments occurs:


a. an event encoded at the sender fails to be played out at the receiver, either because the event report is lost or because it arrives after playout of later content has started;


b. the start of playout of an event at the receiver is delayed relative to other events or other media operating on the same timestamp base;


c. the duration of playout of a given event differs from the correct duration as detected at the sender by more than a given amount;


d. gaps occur in playout of a given event;


e. end-to-end delay for the media stream exceeds a given value.


The relative importance of these constraints varies between applications.


2.6.2. Reliability Mechanisms

To improve reliability, all payload types including telephone-events can use a jitter buffer, i.e., impose a playout delay, at the receiving end. This mechanism addresses the first four requirements listed above, but at the expense of the last one.


The named event procedures provide two complementary redundancy mechanisms to deal with lost packets:


a. Intra-event updates:


       Events that last longer than one packetization period (e.g., 50
       ms) are updated periodically, so that the receiver can
       reconstruct the event and its duration if it receives any of the
       update packets, albeit with delay.

During an event, the RTP event payload format provides incremental updates on the event. The error resiliency afforded by this mechanism depends on whether the first or second algorithm in Section 2.5.2.2 is used and on the playout delay at the receiver. For example, if the receiver uses the first algorithm and only places the current duration of tone signal in the playout buffer, for a playout delay of 120 ms and a packetization interval of 50 ms, two packets in a row can get lost without causing a premature end of the tone generated.


b. Repeat last event packet:


       As described in Section 2.5.1.4, the last report for an event is
       transmitted a total of three times.  This mechanism adds
       robustness to the reporting of the end of an event.

It may be necessary to extend the level of redundancy to achieve requirement a) (in Section 2.6.1) in a specific network environment. Taking the 25-30% loss rate during congestion periods illustrated in [28] as typical, and setting an objective that at least 99% of end-of-event reports will eventually get through to the receiver under these conditions, simple probability calculations indicate that each event completion has to be reported four times. This is one more level of redundancy than required by the basic "Repeat last event packet" algorithm. Of course, the objective is probably unrealistically stringent; it was chosen to make a point.
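The "simple probability calculations" can be reproduced under the assumption that each transmission is lost independently with the same probability, so that all n copies are lost with probability p^n. The function name below is hypothetical.

```python
def transmissions_needed(loss_rate, target_delivery):
    """Smallest n such that 1 - loss_rate**n >= target_delivery.

    Assumes independent losses, which real congestion may not satisfy.
    """
    if not 0.0 <= loss_rate < 1.0:
        raise ValueError("loss_rate must be in [0, 1)")
    if not 0.0 < target_delivery < 1.0:
        raise ValueError("target_delivery must be in (0, 1)")
    n = 1
    all_lost = loss_rate  # probability that every copy so far was lost
    while 1.0 - all_lost < target_delivery:
        n += 1
        all_lost *= loss_rate
    return n
```

At 25% loss, three transmissions deliver about 98.4% of reports and four deliver about 99.6%; at 30% loss, four deliver about 99.2%. Both ends of the range therefore call for four transmissions to meet a 99% objective.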


Where Section 2.5.1.4 indicates that it is appropriate to use the RFC 2198 [2] audio redundancy mechanism to carry retransmissions of final event reports, this mechanism MAY also be used to extend the number of final report retransmissions. This is done by using more than two levels of redundancy when necessary. The use of RFC 2198 helps to mitigate the extra bandwidth demands that would be imposed simply by retransmitting final event packets more than three times.


These two redundancy mechanisms clearly address requirement a) in the previous section. They also help meet requirement c), to the extent that the redundant packets arrive before playout of the events they report is due to expire. They are not helpful in meeting the other requirements, although they do not directly cause impairments themselves in the way that a large jitter buffer increases end-to-end delay.


The playout algorithm is an additional mechanism for meeting the performance requirements. In particular, using the second algorithm in Section 2.5.2.2 will meet requirement d) of the previous section by preventing gaps in playout, but at the potential cost of increases in duration (requirement c)).


Finally, there is an interaction between the packetization period used by a sender, the playout delay used by the receiver, and the vulnerability of an event flow to packet losses. Assuming packet losses are independent, a shorter packetization interval means that the receiver can use a smaller playout delay to recover from a given number of consecutive packet losses, at any stage of event playout. This improves end-to-end delays in applications where that matters.


In view of the tradeoffs between the different reliability mechanisms, documentation of specific events SHOULD include a discussion of the appropriate design decisions for the applications of those events. This mandate is repeated in the section on IANA considerations.


2.6.3. Adjusting to Congestion

So far, the discussion has been about meeting performance requirements. However, there is also the question of whether applications of events can adapt to congestion to the point that they reduce their demands on the networks during congestion. In theory this can be done for events by increasing the packetization interval, so that fewer packets are sent per second. This has to be accompanied by an increased playout delay at the receiving end. Coordination between the two ends for this purpose is an interesting issue in itself. If it is done, however, such an action implies a one-time gap or extended playout of an event when the packetization interval is first extended, as well as increased end-to-end delay during the whole period of increased playout delay.


The benefit from such a measure varies primarily depending on the average duration of the events being handled. In the worst case, as a first example shows, the reduction in aggregate bandwidth usage due to an increased packetization interval may be quite modest. Suppose the average event duration is 3.33 ms (V.21 bits, for instance). Suppose further that four transmissions in total are required for a given event report to meet the loss objective. Table 1 shows the impact of varying packetization intervals on the aggregate bit rate of the media stream.


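   +--------------------+-----------+---------------+------------------+
   | Packetization      | Packets/s |     IP Packet |     Total IP Bit |
   | Interval (ms)      |           |   Size (bits) |    Rate (bits/s) |
   +--------------------+-----------+---------------+------------------+
   | 50                 |        20 |          2440 |            48800 |
   | 33.3               |        30 |          1800 |            54000 |
   | 25                 |        40 |          1480 |            59200 |
   | 20                 |        50 |          1288 |            64400 |
   +--------------------+-----------+---------------+------------------+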

Table 1: Data Rate at the IP Level versus Packetization Interval (three retransmissions, 3.33 ms per event)


As can be seen, a doubling of the interval (from 25 to 50 ms) drops aggregate bit rate by about 20% while increasing end-to-end delay by 25 ms and causing a one-time gap of the same amount. (Extending the playout of a specific V.21 tone event is out of the question, so the first algorithm of Section 2.5.2.2 must be used in this application.) The reduction in number of packets per second with longer packetization periods is countered by the increase in packet size due to the increase in number of events per packet.
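The last column of Table 1 is simply the product of the two preceding columns, which the following sketch recomputes from the table's own figures. The variable names are illustrative; no attempt is made here to derive the packet sizes themselves.

```python
# Packetization interval (ms) -> (packets per second, IP packet size in bits),
# copied from Table 1.
TABLE_1 = {
    50.0: (20, 2440),
    33.3: (30, 1800),
    25.0: (40, 1480),
    20.0: (50, 1288),
}


def bit_rate(packets_per_second, packet_bits):
    """Aggregate IP-level bit rate in bits per second."""
    return packets_per_second * packet_bits


rates = {interval: bit_rate(pps, bits)
         for interval, (pps, bits) in TABLE_1.items()}

# Fractional saving when the interval is doubled from 25 ms to 50 ms.
saving = 1 - rates[50.0] / rates[25.0]
```

Going from 25 ms to 50 ms lowers the rate from 59200 to 48800 bits/s, a saving of roughly 18%, in line with the "about 20%" cited above.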


For events of longer duration, the reduction in bandwidth is more proportional to the increase in packetization interval. The loss of final event reports may also be less critical, so that lower redundancy levels are acceptable. Table 2 shows similar data to Table 1, but assuming 70-ms events separated by 50 ms of silence (as in an idealized DTMF-based text messaging session) with only the basic two retransmissions for event completions.


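   +--------------------+-----------+---------------+------------------+
   | Packetization      | Packets/s |     IP Packet |     Total IP Bit |
   | Interval (ms)      |           |   Size (bits) |    Rate (bits/s) |
   +--------------------+-----------+---------------+------------------+
   | 50                 |        20 |       448/520 |            10040 |
   | 33.3               |        30 |       448/520 |            14280 |
   | 25                 |        40 |       448/520 |            18520 |
   | 20                 |        50 |           448 |            22400 |
   +--------------------+-----------+---------------+------------------+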

Table 2: Data Rate at the IP Level versus Packetization Interval (two retransmissions, 70 ms per event, 50 ms between events)


In the third column of the table, the packet size is 448 bits when only one event is being reported and 520 bits when the previous event is also included. No more than one level of redundancy is needed up to a packetization interval of 50 ms, although at that point most packets are reporting two events. Longer intervals require a second level of redundancy in at least some packets.


3. Specification of Event Codes for DTMF Events

This document defines one class of named events: DTMF tones.


3.1. DTMF Applications

DTMF signalling [10] is typically generated by a telephone set or possibly by a PBX (private branch exchange). DTMF digits may be consumed by entities such as gateways or application servers in the IP network, or by entities such as telephone switches or IVRs in the circuit-switched network.


The DTMF events support two possible applications at the sending end:


1. The Internet telephony gateway detects DTMF on the incoming circuits and sends the RTP payload described here instead of regular audio packets. The gateway likely has the necessary digital signal processors and algorithms, as it often needs to detect DTMF, e.g., for two-stage dialing. Having the gateway detect tones relieves the receiving Internet end system from having to do this work and also avoids having low bit-rate codecs like G.723.1 [20] render DTMF tones unintelligible.


2. An Internet end system such as an "Internet phone" can emulate DTMF functionality without concerning itself with generating precise tone pairs and without imposing the burden of tone recognition on the receiver.


A similar distinction occurs at the receiving end.


1. In the gateway scenario, an Internet telephony gateway connecting a packet voice network to the PSTN re-creates the DTMF tones or other telephony events and injects them into the PSTN.


2. In the end system scenario, the DTMF events are consumed by the receiving entity itself.


In the most common application, DTMF tones are sent in one direction only, typically from the calling end. The consuming device is most commonly an IVR. DTMF may alternate with voice from either end. In most cases, the only constraint on tone duration is that it exceed a minimum value. However, in some cases a long-duration tone (in excess of 1-2 seconds) has special significance.


ITU-T Recommendation Q.24 [11], Table A-1, indicates that the legacy switching equipment in the countries surveyed expects a minimum recognizable signal duration of 40 ms, a minimum pause between signals of 40 ms, and a maximum signalling rate of 8 to 10 digits per second depending on the country. Human-generated DTMF signals, of course, are generally longer with larger pauses between them.


DTMF tones may also be used for text telephony. This application is documented in ITU-T Recommendation V.18 [27] Annex B. In this case, DTMF is sent alternately from either end (half-duplex mode), with a minimum 300-ms turn-around time. The only constraints on tone durations in this application are that they and the pauses between them must exceed specified minimum values. It is RECOMMENDED that a gateway at the sending end be capable of detecting DTMF signals as specified by V.18 Annex B (tones and pauses >=40 ms), but should send event durations corresponding to those of a V.18 DTMF sender (tones >=70 ms, pauses >=50 ms). This may occasionally imply some degree of buffering of outgoing events, but if the source terminal conforms to V.18 Annex B, this should not get out of hand.
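One way a sending gateway might reconcile the two sets of minima is sketched below. The buffering policy shown (stretch short tones and pauses to the V.18 sender minima, then recover the accumulated lag by shortening later, longer pauses) is purely an assumption of this example, as are the function name and the millisecond tuple format.

```python
def normalize_for_v18(detected, min_tone=70, min_pause=50):
    """Stretch detected (tone_ms, pause_ms) pairs to V.18 sender minima.

    Returns the list of (tone_ms, pause_ms) pairs to send and the worst
    backlog, in ms, by which outgoing events lag the detected signal.
    """
    sent = []
    backlog = 0
    worst = 0
    for tone, pause in detected:
        out_tone = max(tone, min_tone)
        backlog += out_tone - tone
        worst = max(worst, backlog)
        # Shorten the pause to absorb backlog, but never below the minimum.
        out_pause = max(min_pause, pause - backlog)
        backlog += out_pause - pause
        worst = max(worst, backlog)
        sent.append((out_tone, out_pause))
    return sent, worst
```

For two minimum-length digits (40 ms tone, 40 ms pause) followed by an 80 ms tone and a 300 ms pause, this sketch sends (70, 50), (70, 50), (80, 220) and never lags by more than 80 ms, which illustrates why the buffering stays bounded when the source respects the V.18 turn-around time.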


Since minor increases in tone duration are harmless for all applications of DTMF, but unintended breaks in playout of a DTMF digit can confuse the receiving endpoint by creating the appearance of extra digits, receiving applications that are converting DTMF events back to tones SHOULD use the second playout algorithm rather than the first one in Section 2.5.2.2. This provides some robustness against packet loss or congestion.


3.2. DTMF Events

Table 3 shows the DTMF-related event codes within the telephone-event payload format. The DTMF digits 0-9 and * and # are commonly supported. DTMF digits A through D are less frequently encountered, typically in special applications such as military networks.


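                    +-------+--------+------+---------+
                    | Event | Code   | Type | Volume? |
                    +-------+--------+------+---------+
                    | 0--9  | 0--9   | tone | yes     |
                    | *     | 10     | tone | yes     |
                    | #     | 11     | tone | yes     |
                    | A--D  | 12--15 | tone | yes     |
                    +-------+--------+------+---------+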

Table 3: DTMF Named Events
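The assignments in Table 3 map directly onto a lookup table. The following sketch is illustrative; the names are hypothetical, and accepting lower-case letters for A through D is a convenience of this example rather than anything required by the table.

```python
# Event codes from Table 3: digits 0-9 map to themselves, then *, #, A-D.
DTMF_EVENT_CODES = {
    **{str(d): d for d in range(10)},
    "*": 10,
    "#": 11,
    "A": 12,
    "B": 13,
    "C": 14,
    "D": 15,
}


def digits_to_events(digits):
    """Translate a string of DTMF digits into telephone-event codes."""
    return [DTMF_EVENT_CODES[ch] for ch in digits.upper()]
```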


3.3. Congestion Considerations

The key considerations for the delivery of DTMF events are reliability and avoidance of unintended breaks within the playout of a given tone. End-to-end round-trip delay is not a major consideration except in the special case where DTMF tones are being used for text telephony. Assuming that, as recommended in Section 3.1 above, the second playout algorithm of Section 2.5.2.2 is in use, a temporary increase in packetization interval to as much as 100 ms or double the normal interval, whichever is less, should be harmless.
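The bound on the temporary increase can be written as a one-line rule. The function name is hypothetical and intervals are in milliseconds.

```python
def max_congested_interval_ms(normal_interval_ms):
    """Largest packetization interval considered harmless under congestion:
    100 ms or double the normal interval, whichever is less."""
    return min(100, 2 * normal_interval_ms)
```

A sender normally using 20 ms could thus back off to 40 ms, while one normally using 50 ms or more is capped at 100 ms.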


4. RTP Payload Format for Telephony Tones
4.1. Introduction

As an alternative to describing tones and events by name, as described in Section 2, it is sometimes preferable to describe them by their waveform properties. In particular, recognition is faster than for naming signals since it does not depend on recognizing durations or pauses.


There is no single international standard for telephone tones such as dial tone, ringing (ringback), busy, congestion ("fast-busy"), special announcement tones, or some of the other special tones, such as payphone recognition, call waiting or record tone. However, ITU-T Recommendation E.180 [18] notes that across all countries, these tones share a number of characteristics:


o Telephony tones consist of either a single tone, the addition of two or three tones or the modulation of two tones. (Almost all tones use two frequencies; only the Hungarian "special dial tone" has three.) Tones that are mixed have the same amplitude and do not decay.


o In-band tones for telephony events are in the range of 25 Hz (ringing tone in Angola) to 2600 Hz (the tone used for line signalling in SS No. 5 and R1). The in-band telephone frequency range is limited to 3400 Hz. R2 defines a 3825 Hz out-of-band tone for line signalling on analogue trunks. (The piano has a range from 27.5 to 4186 Hz.)


o Modulation frequencies range from 15 Hz (ANSam tone) to 480 Hz (Jamaica). Non-integer frequencies are used only for frequencies of 16 2/3 and 33 1/3 Hz.


o Tones that are not continuous have durations of less than four seconds.


o ITU Recommendation E.180 [18] notes that different telephone companies require a tone accuracy of between 0.5 and 1.5%. The Recommendation suggests a frequency tolerance of 1%.


4.2. Examples of Common Telephone Tone Signals

As an aid to the implementor, Table 4 summarizes some common tones. The rows labeled "ITU ..." refer to ITU-T Recommendation E.180 [18]. In these rows, the on and off durations are suggested ranges within which local standards would set specific values. The symbol "+" in the table indicates addition of the tones, without modulation, while "*" indicates amplitude modulation.


   | Tone Name               | Frequency         | On Time  | Off Time |
   |                         |                   | (s)      | (s)      |
   | CNG                     | 1100              | 0.5      | 3.0      |
   | V.25 CT                 | 1300              | 0.5      | 2.0      |
   | CED                     | 2100              | 3.3      | --       |
   | ANS                     | 2100              | 3.3      | --       |
   | ANSam                   | 2100*15           | 3.3      | --       |
   | V.21 bit                | 980 or 1180 or    | 0.00333  | --       |
   |                         | 1650 or 1850      |          |          |
   | -------------           | ----------        | -------- | -------- |
   | ITU dial tone           | 425               | --       | --       |
   | U.S. dial tone          | 350+440           | --       | --       |
   | ITU ringing tone        | 425               | 0.67-1.5 | 3-5      |
   | U.S. ringing tone       | 440+480           | 2.0      | 4.0      |
   | ITU busy tone           | 425               | 0.1-0.6  | 0.1-0.7  |
   | U.S. busy tone          | 480+620           | 0.5      | 0.5      |
   | ITU congestion tone     | 425               | 0.1-0.6  | 0.1-0.7  |
   | U.S. congestion tone    | 480+620           | 0.25     | 0.25     |

Table 4: Examples of Telephony Tones
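
To make the "+" and "*" notation of Table 4 concrete, the following sketch splits a frequency entry into the list of added frequencies and the modulation frequency. The parser and its name are illustrative assumptions; entries that list alternatives with "or" (such as the V.21 bit row) are deliberately not handled.

```python
from typing import List, Tuple


def parse_tone_spec(spec: str) -> Tuple[List[int], int]:
    """Split a Table 4 frequency entry into (added frequencies, modulation frequency).

    "350+440" means two tones added without modulation; "2100*15" means a
    2100 Hz tone amplitude-modulated at 15 Hz. A modulation of 0 means none.
    """
    carrier, _, modulation = spec.partition("*")
    frequencies = [int(f) for f in carrier.split("+")]
    return frequencies, (int(modulation) if modulation else 0)
```

The two parts of the result correspond to the frequency and modulation fields of the tone payload.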


4.3. Use of RTP Header Fields
4.3.1. Timestamp

The RTP timestamp reflects the measurement point for the current packet. The event duration described in Section 4.3.3 begins at that time.


4.3.2. Marker Bit

The tone payload type uses the marker bit to distinguish the first RTP packet reporting a given instance of a tone from succeeding packets for that tone. The marker bit SHOULD be set to 1 for the first packet, and to 0 for all succeeding packets relating to the same tone.


4.3.3. Payload Format

Based on the characteristics described above, this document defines an RTP payload format called "tone" that can represent tones consisting of one or more frequencies. (The corresponding media type is "audio/tone".) The default timestamp rate is 8000 Hz, but other rates may be defined. Note that the timestamp rate does not affect the interpretation of the frequency, just the durations.


In accordance with current practice, this payload format does not have a static payload type number, but uses an RTP payload type number established dynamically and out-of-band.


The payload format is shown in Figure 2.


        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       |    modulation   |T|  volume   |          duration             |
       |R R R R|       frequency       |R R R R|       frequency       |
       |R R R R|       frequency       |R R R R|       frequency       |
       |R R R R|       frequency       |R R R R|      frequency        |

Figure 2: Payload Format for Tones


The payload contains the following fields:


modulation: The modulation frequency, in Hz. The field is a 9-bit unsigned integer, allowing modulation frequencies up to 511 Hz. If there is no modulation, this field has a value of zero. Note that the amplitude of modulation is not indicated in the payload and must be determined by out-of-band means.


T: If the T bit is set (one), the modulation frequency is to be divided by three. Otherwise, the modulation frequency is taken as is.


This bit allows frequencies accurate to 1/3 Hz, since modulation frequencies such as 16 2/3 Hz are in practical use.


volume: The power level of the tone, expressed in dBm0 after dropping the sign, with range from 0 to -63 dBm0. (Note: A preferred level range for digital tone generators is -8 dBm0 to -3 dBm0.)


duration: The duration of the tone, measured in timestamp units and presented in network byte order. The tone begins at the instant identified by the RTP timestamp and lasts for the duration value. The value of zero is not permitted, and tones with such a duration SHOULD be ignored.


The definition of duration corresponds to that for sample-based codecs, where the timestamp represents the sampling point for the first sample.


frequency: The frequencies of the tones to be added, measured in Hz and represented as a 12-bit unsigned integer. The field size is sufficient to represent frequencies up to 4095 Hz, which exceeds the range of telephone systems. A value of zero indicates silence. A single tone can contain any number of frequencies. If no frequencies are specified, the packet reports a period of silence.


R: This field is reserved for future use. The sender MUST set it to zero, and the receiver MUST ignore it.
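
The field layout described above can be sketched with Python's struct module. This is a minimal illustration rather than a reference implementation: the function names are assumptions, the volume argument is the magnitude of the dBm0 level (sign dropped), and the reserved bits are always written as zero and masked off on receipt.

```python
import struct
from fractions import Fraction
from typing import List, Tuple


def pack_tone_payload(modulation: int, t_bit: bool, volume: int,
                      duration: int, frequencies: List[int]) -> bytes:
    """Encode a tone payload: 9-bit modulation, T, 6-bit volume, 16-bit duration,
    then one 16-bit unit (4 reserved bits + 12-bit frequency) per frequency."""
    if not 0 <= modulation <= 511:
        raise ValueError("modulation must fit in 9 bits")
    if not 0 <= volume <= 63:
        raise ValueError("volume must be 0..63 (dBm0 with the sign dropped)")
    if not 1 <= duration <= 0xFFFF:
        raise ValueError("duration must be 1..65535 timestamp units")
    if any(not 0 <= f <= 4095 for f in frequencies):
        raise ValueError("each frequency must fit in 12 bits")
    first_word = (modulation << 23) | (int(t_bit) << 22) | (volume << 16) | duration
    payload = struct.pack("!I", first_word)          # network byte order
    for f in frequencies:
        payload += struct.pack("!H", f)              # reserved bits R are zero
    return payload


def unpack_tone_payload(payload: bytes) -> Tuple[int, bool, int, int, List[int]]:
    """Decode a tone payload, ignoring the reserved bits."""
    (first_word,) = struct.unpack("!I", payload[:4])
    count = (len(payload) - 4) // 2
    units = struct.unpack(f"!{count}H", payload[4:4 + 2 * count])
    return (first_word >> 23, bool((first_word >> 22) & 1),
            (first_word >> 16) & 0x3F, first_word & 0xFFFF,
            [u & 0x0FFF for u in units])


def modulation_hz(modulation: int, t_bit: bool) -> Fraction:
    """Effective modulation frequency: divided by three when T is set."""
    return Fraction(modulation, 3) if t_bit else Fraction(modulation)
```

For instance, a modulation frequency of 16 2/3 Hz can be carried as a modulation value of 50 with the T bit set.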


4.3.4. Optional Media Type Parameters

The "rate" parameter describes the sampling rate, in Hertz. The number is written as an integer. If omitted, the default value is 8000 Hz.


4.4. Procedures

This section defines the procedures associated with the tone payload type.


4.4.1. Sending Procedures

The sender MAY send an initial tones packet as soon as a tone is recognized, or MAY wait until a pre-negotiated packetization period has elapsed. The first RTP packet for a tone SHOULD have the marker bit set to 1.


In the case of longer-duration tones, the sender SHOULD generate multiple RTP packets for the same tone instance. The RTP timestamp MUST be updated for each packet generated (in contrast, for instance, to the timestamp for packets carrying telephone events). Subsequent packets for the same tone SHOULD have the marker bit set to 0, and the RTP timestamp in each subsequent packet MUST equal the sum of the timestamp and the duration in the preceding packet.


A final RTP packet MAY be generated as soon as the end of the tone is detected, without waiting for the latest packetization period to elapse.
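
The sending rules above (marker bit on the first packet only, each timestamp equal to the previous timestamp plus the previous duration, and a possibly shorter final packet) can be sketched as follows. The names and the default interval of 400 timestamp units (50 ms at 8000 Hz) are assumptions made for illustration.

```python
from typing import List, NamedTuple


class ToneReport(NamedTuple):
    marker: bool      # True only for the first packet of the tone
    timestamp: int    # RTP timestamp of this report
    duration: int     # duration field, in timestamp units


def tone_reports(start_ts: int, total_duration: int,
                 interval: int = 400) -> List[ToneReport]:
    """Split one tone into consecutive tone-payload reports."""
    reports = []
    offset = 0
    while offset < total_duration:
        # The last report may be shorter than the packetization interval.
        duration = min(interval, total_duration - offset)
        timestamp = (start_ts + offset) & 0xFFFFFFFF   # RTP timestamps are 32 bits
        reports.append(ToneReport(offset == 0, timestamp, duration))
        offset += duration
    return reports
```

A tone of 1760 timestamp units starting at timestamp 11200 yields five reports, the last one at timestamp 12800 with a duration of 160.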


The telephone-event payload described in Section 2 is inherently redundant, in that later packets for the same event carry all of the earlier history of the event except for variations in volume. In contrast, each packet for the tone payload type stands alone; a lost packet means a gap in the information available at the receiving end. Thus, for increased reliability, the sender SHOULD combine new and old tone reports in the same RTP packet using RFC 2198 [2] audio redundancy.


4.4.2. Receiving Procedures

Receiving implementations play out the tones as received, typically with a playout delay to allow for lost packets. When playing out successive tone reports for the same tone (marker bit is zero, the RTP timestamp is contiguous with that of the previous RTP packet, and payload content is identical), the receiving implementation SHOULD continue the tone without change or a break.
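
A receiver-side check for "same tone" might look like the sketch below. It is an interpretation, not a normative test: the names are hypothetical, and "payload content is identical" is assumed here to mean that the modulation, T bit, volume, and frequencies match, while the duration is allowed to differ so that a shorter final report still continues the tone.

```python
from typing import NamedTuple, Tuple


class ToneDescription(NamedTuple):
    marker: bool
    timestamp: int
    duration: int
    modulation: int
    t_bit: bool
    volume: int
    frequencies: Tuple[int, ...]


def continues_tone(prev: ToneDescription, cur: ToneDescription) -> bool:
    """True if `cur` should be played as a seamless continuation of `prev`."""
    contiguous = cur.timestamp == ((prev.timestamp + prev.duration) & 0xFFFFFFFF)
    same_tone = (prev.modulation, prev.t_bit, prev.volume, prev.frequencies) == \
                (cur.modulation, cur.t_bit, cur.volume, cur.frequencies)
    return (not cur.marker) and contiguous and same_tone
```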


4.4.3. Handling of Congestion

If the sender determines that packets are being lost due to congestion (e.g., through RTCP receiver reports), it SHOULD increase the packetization interval for initial and interim tone reports so as to reduce traffic volume to the receiver. The degree to which this is possible without causing damaging consequences at the receiving end depends both upon the playout delay used at that end and upon the specific application associated with the tones. Both the maximum packetization interval and maximum increase in packetization interval at any one time are therefore a matter of configuration or out-of-band negotiation.


5. Examples

Consider a DTMF dialling sequence, where the user dials the digits "911" and a sending gateway detects them. The first digit is 200 ms long (1600 timestamp units) and starts at time 0; the second digit lasts 250 ms (2000 timestamp units) and starts at time 880 ms (7040 timestamp units); the third digit is pressed at time 1.4 s (11,200 timestamp units) and lasts 220 ms (1760 timestamp units). The frame duration is 50 ms.


Table 5 shows the complete sequence of events assuming that only the telephone-event payload type is being reported. For simplicity: the timestamp is assumed to begin at 0, the RTP sequence number at 1, and volume settings are omitted.


   |  Time | Event     |   M  |  Time- |  Seq |  Event |  Dura- |   E  |
   |  (ms) |           |  bit |  stamp |   No |  Code  |   tion |  bit |
   |     0 | "9"       |      |        |      |        |        |      |
   |       | starts    |      |        |      |        |        |      |
   |    50 | RTP       |  "1" |      0 |    1 |    9   |    400 |  "0" |
   |       | packet 1  |      |        |      |        |        |      |
   |       | sent      |      |        |      |        |        |      |
   |   100 | RTP       |  "0" |      0 |    2 |    9   |    800 |  "0" |
   |       | packet 2  |      |        |      |        |        |      |
   |       | sent      |      |        |      |        |        |      |
   |   150 | RTP       |  "0" |      0 |    3 |    9   |   1200 |  "0" |
   |       | packet 3  |      |        |      |        |        |      |
   |       | sent      |      |        |      |        |        |      |
   |   200 | RTP       |  "0" |      0 |    4 |    9   |   1600 |  "0" |
   |       | packet 4  |      |        |      |        |        |      |
   |       | sent      |      |        |      |        |        |      |
   |   200 | "9" ends  |      |        |      |        |        |      |
   |   250 | RTP       |  "0" |      0 |    5 |    9   |   1600 |  "1" |
   |       | packet 4  |      |        |      |        |        |      |
   |       | first     |      |        |      |        |        |      |
   |       | retrans-  |      |        |      |        |        |      |
   |       | mission   |      |        |      |        |        |      |
   |   300 | RTP       |  "0" |      0 |    6 |    9   |   1600 |  "1" |
   |       | packet 4  |      |        |      |        |        |      |
   |       | second    |      |        |      |        |        |      |
   |       | retrans-  |      |        |      |        |        |      |
   |       | mission   |      |        |      |        |        |      |
   |   880 | First "1" |      |        |      |        |        |      |
   |       | starts    |      |        |      |        |        |      |
   |   930 | RTP       |  "1" |   7040 |    7 |    1   |    400 |  "0" |
   |       | packet 5  |      |        |      |        |        |      |
   |       | sent      |      |        |      |        |        |      |
   |   ... | ...       |  ... |    ... |  ... |   ...  |    ... |  ... |
   |  1130 | RTP       |  "0" |   7040 |   11 |    1   |   2000 |  "0" |
   |       | packet 9  |      |        |      |        |        |      |
   |       | sent      |      |        |      |        |        |      |
   |  1130 | First "1" |      |        |      |        |        |      |
   |       | ends      |      |        |      |        |        |      |
   |  1180 | RTP       |  "0" |   7040 |   12 |    1   |   2000 |  "1" |
   |       | packet 9  |      |        |      |        |        |      |
   |       | first     |      |        |      |        |        |      |
   |       | retrans-  |      |        |      |        |        |      |
   |       | mission   |      |        |      |        |        |      |
   |  1230 | RTP       |  "0" |   7040 |   13 |    1   |   2000 |  "1" |
   |       | packet 9  |      |        |      |        |        |      |
   |       | second    |      |        |      |        |        |      |
   |       | retrans-  |      |        |      |        |        |      |
   |       | mission   |      |        |      |        |        |      |
   |  1400 | Second    |      |        |      |        |        |      |
   |       | "1"       |      |        |      |        |        |      |
   |       | starts    |      |        |      |        |        |      |
   |  1450 | RTP       |  "1" |  11200 |   14 |    1   |    400 |  "0" |
   |       | packet 10 |      |        |      |        |        |      |
   |       | sent      |      |        |      |        |        |      |
   |   ... | ...       |  ... |    ... |  ... |   ...  |    ... |  ... |
   |  1620 | Second    |      |        |      |        |        |      |
   |       | "1" ends  |      |        |      |        |        |      |
   |  1650 | RTP       |  "0" |  11200 |   18 |    1   |   1760 |  "1" |
   |       | packet 14 |      |        |      |        |        |      |
   |       | sent      |      |        |      |        |        |      |
   |  1700 | RTP       |  "0" |  11200 |   19 |    1   |   1760 |  "1" |
   |       | packet 14 |      |        |      |        |        |      |
   |       | first     |      |        |      |        |        |      |
   |       | retrans-  |      |        |      |        |        |      |
   |       | mission   |      |        |      |        |        |      |
   |  1750 | RTP       |  "0" |  11200 |   20 |    1   |   1760 |  "1" |
   |       | packet 14 |      |        |      |        |        |      |
   |       | second    |      |        |      |        |        |      |
   |       | retrans-  |      |        |      |        |        |      |
   |       | mission   |      |        |      |        |        |      |

Table 5: Example of Event Reporting
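
The packet sequence in Table 5 can be reproduced with a short generator. This is a sketch under stated assumptions: interim reports are emitted at every full frame while the event is still thought to be in progress, a separate final report is added only when the event ends between frame boundaries, and two retransmissions follow. The names are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class EventReport(NamedTuple):
    seq: int
    marker: bool
    timestamp: int
    event: int
    duration: int
    end: bool         # E bit


def ms_to_units(ms: int, rate: int = 8000) -> int:
    """Convert milliseconds to timestamp units at the given clock rate."""
    return ms * rate // 1000


def event_reports(events: List[Tuple[int, int, int]], frame: int = 400,
                  retransmissions: int = 2, first_seq: int = 1) -> List[EventReport]:
    """events: (event code, start timestamp, total duration in timestamp units)."""
    reports = []
    seq = first_seq
    for event, start_ts, total in events:
        reported = 0
        while reported + frame <= total:          # interim reports, E bit clear
            reported += frame
            reports.append(EventReport(seq, reported == frame, start_ts,
                                       event, reported, False))
            seq += 1
        if reported < total:                      # end detected between frames
            reports.append(EventReport(seq, reported == 0, start_ts,
                                       event, total, True))
            seq += 1
        for _ in range(retransmissions):          # repeated final reports
            reports.append(EventReport(seq, False, start_ts, event, total, True))
            seq += 1
    return reports
```

Calling event_reports with the three digits of the example, for instance [(9, 0, 1600), (1, 7040, 2000), (1, 11200, 1760)], produces twenty reports whose sequence numbers, timestamps, durations, and E bits match the rows shown in Table 5. Note that the timestamp stays fixed for all reports of one event while the duration grows.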


Table 6 shows the same sequence assuming that only the tone payload type is being reported. This looks somewhat different. For simplicity, the timestamp is assumed to begin at 0 and the sequence number at 1. Volume, the T bit, and the modulation frequency are omitted; the latter two are always 0.


   |  Time | Event     |  M  |  Time- |  Seq | Dura-  | Freq 1| Freq 2 |
   |  (ms) |           | bit |  stamp |   No | tion   | (Hz)  | (Hz)   |
   |     0 | "9"       |     |        |      |        |       |        |
   |       | starts    |     |        |      |        |       |        |
   |    50 | RTP       | "1" |      0 |    1 | 400    | 852   | 1477   |
   |       | packet 1  |     |        |      |        |       |        |
   |       | sent      |     |        |      |        |       |        |
   |   100 | RTP       | "0" |    400 |    2 | 400    | 852   | 1477   |
   |       | packet 2  |     |        |      |        |       |        |
   |       | sent      |     |        |      |        |       |        |
   |   ... | ...       | ... |    ... |  ... | ...    | ...   | ...    |
   |   200 | RTP       | "0" |   1200 |    4 | 400    | 852   | 1477   |
   |       | packet 4  |     |        |      |        |       |        |
   |       | sent      |     |        |      |        |       |        |
   |   200 | "9" ends  |     |        |      |        |       |        |
   |   880 | First "1" |     |        |      |        |       |        |
   |       | starts    |     |        |      |        |       |        |
   |   930 | RTP       | "1" |   7040 |    5 | 400    | 697   | 1209   |
   |       | packet 5  |     |        |      |        |       |        |
   |       | sent      |     |        |      |        |       |        |
   |   980 | RTP       | "0" |   7440 |    6 | 400    | 697   | 1209   |
   |       | packet 6  |     |        |      |        |       |        |
   |       | sent      |     |        |      |        |       |        |
   |   ... | ...       | ... |    ... |  ... | ...    | ...   | ...    |
   |  1130 | First "1" |     |        |      |        |       |        |
   |       | ends      |     |        |      |        |       |        |
   |  1400 | Second    |     |        |      |        |       |        |
   |       | "1"       |     |        |      |        |       |        |
   |       | starts    |     |        |      |        |       |        |
   |  1450 | RTP       | "1" |  11200 |   10 | 400    | 697   | 1209   |
   |       | packet 10 |     |        |      |        |       |        |
   |       | sent      |     |        |      |        |       |        |
   |   ... | ...       | ... |    ... |  ... | ...    | ...   | ...    |
   |  1620 | Second    |     |        |      |        |       |        |
   |       | "1" ends  |     |        |      |        |       |        |
   |  1650 | RTP       | "0" |  12800 |   14 | 160    | 697   | 1209   |
   |       | packet 14 |     |        |      |        |       |        |
   |       | sent      |     |        |      |        |       |        |

Table 6: Example of Tone Reporting

Now consider a combined payload, where the tone payload is the primary payload type and the event payload is treated as a redundant encoding (one level of redundancy). Because the primary payload is tones, the tone payload rules determine the setting of the RTP header fields. This means that the RTP timestamp always advances. As a corollary, the timestamp offset for the events payload in the RFC 2198 header increases by the same amount.


One issue that has to be considered in a combined payload is how to handle retransmissions of final event reports. The tone payload specification does not recommend retransmissions of final packets, so it is unclear what to put in the primary payload fields of the combined packet. In the interests of simplicity, it is suggested that the retransmitted packets copy the fields relating to the primary payload (including the RTP timestamp) from the original packet. The same principle can be applied if the packet includes multiple levels of event payload redundancy.


The figures below all illustrate "RTP packet 14" in the above tables. Figure 3 shows an event-only payload, corresponding to Table 5. Figure 4 shows a tone-only payload, corresponding to Table 6. Finally, Figure 5 shows a combined payload, with tones primary and events as a single redundant layer. Note that the combined payload has the RTP sequence numbers shown in Table 5, because the transmitted sequence includes the retransmitted packets.


Figure 3 assumes that the following SDP specification was used. This session description provides for separate streams of G.729 [21] audio and events. Packets reported within the G.729 stream are not considered here.


    m=audio 12344 RTP/AVP 99
    a=rtpmap:99 G729/8000
    a=ptime:20
    m=audio 12346 RTP/AVP 100
    a=rtpmap:100 telephone-event/8000
    a=fmtp:100 0-15
    a=ptime:50


       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      |V=2|P|X|  CC   |M|     PT      |       sequence number         |
      | 2 |0|0|   0   |0|    100      |            18                 |
      |                           timestamp                           |
      |                             11200                             |
      |           synchronization source (SSRC) identifier            |
      |                            0x5234a8                           |
      |     event     |E R| volume    |          duration             |
      |       1       |1 0|    20     |             1760              |

Figure 3: Example RTP Packet for Event Payload
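
The packet in Figure 3 can be serialized as shown in the following sketch, which packs the fixed 12-octet RTP header (no CSRC entries, no extension, no padding) followed by the four-octet event payload. The helper names are assumptions; the field values are those shown in the figure.

```python
import struct


def pack_rtp_header(payload_type: int, seq: int, timestamp: int,
                    ssrc: int, marker: bool = False) -> bytes:
    """Fixed RTP header with V=2, P=0, X=0, CC=0."""
    byte0 = 2 << 6
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                       timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def pack_event_payload(event: int, end: bool, volume: int, duration: int) -> bytes:
    """Event payload: 8-bit event, E bit, R bit (zero), 6-bit volume, 16-bit duration."""
    return struct.pack("!BBH", event, (int(end) << 7) | (volume & 0x3F), duration)


# Values taken from Figure 3.
packet = (pack_rtp_header(payload_type=100, seq=18, timestamp=11200, ssrc=0x5234A8)
          + pack_event_payload(event=1, end=True, volume=20, duration=1760))
```

The resulting packet is 16 octets long.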


Figure 4 assumes that an SDP specification similar to that of the previous case was used.


    m=audio 12344 RTP/AVP 99
    a=rtpmap:99 G729/8000
    a=ptime:20
    m=audio 12346 RTP/AVP 101
    a=rtpmap:101 tone/8000
    a=ptime:50


       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      |V=2|P|X|  CC   |M|     PT      |       sequence number         |
      | 2 |0|0|   0   |0|    101      |             14                |
      |                           timestamp                           |
      |                             12800                             |
      |           synchronization source (SSRC) identifier            |
      |                            0x5234a8                           |
      |    modulation   |T|  volume   |          duration             |
      |        0        |0|    20     |             160               |
      |R R R R|       frequency       |R R R R|       frequency       |
      |0 0 0 0|          697          |0 0 0 0|         1209          |

Figure 4: Example RTP Packet for Tone Payload


Figure 5, for the combined payload, assumes the following SDP session description:


    m=audio 12344 RTP/AVP 99
    a=rtpmap:99 G729/8000
    a=ptime:20
    m=audio 12346 RTP/AVP 102 101 100
    a=rtpmap:102 red/8000/1
    a=fmtp:102 101/100
    a=rtpmap:101 tone/8000
    a=rtpmap:100 telephone-event/8000
    a=fmtp:100 0-15
    a=ptime:50


For ease of presentation, Figure 5 presents the actual payloads as if they began on 32-bit boundaries. In the actual packet, they follow immediately after the end of the RFC 2198 header, and thus are displaced one octet into successive words.


       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      |V=2|P|X|  CC   |M|     PT      |       sequence number         |
      | 2 |0|0|   0   |0|    102      |             18                |
      |                           timestamp                           |
      |                             12800                             |
      |           synchronization source (SSRC) identifier            |
      |                            0x5234a8                           |
      |F|   block PT  |  timestamp offset         |   block length    |
      |1|      100    |       1600                |        4          |
      |F|   block PT  |   event payload begins ...                    /
      |0|      101    |                                               \

Event payload


      |     event     |E R| volume    |          duration             |
      |       1       |1 0|    20     |             1760              |

Tone payload


      |    modulation   |T|  volume   |          duration             |
      |        0        |0|    20     |             160               |
      |R R R R|       frequency       |R R R R|       frequency       |
      |0 0 0 0|          697          |0 0 0 0|         1209          |

Figure 5: Example RTP Packet for Combined Tone and Event Payloads
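
The RFC 2198 block headers in Figure 5 can be sketched as below: each redundant block has a four-octet header (F=1, 7-bit block payload type, 14-bit timestamp offset, 10-bit block length), and the primary block has a one-octet header (F=0, 7-bit payload type). The function name is an assumption, and the two payloads are packed inline from the field values in the figure.

```python
import struct
from typing import List, Tuple


def pack_red_headers(redundant_blocks: List[Tuple[int, int, int]],
                     primary_pt: int) -> bytes:
    """redundant_blocks: (block PT, timestamp offset, block length in octets)."""
    header = b""
    for block_pt, ts_offset, length in redundant_blocks:
        if not 0 <= ts_offset < (1 << 14):
            raise ValueError("timestamp offset must fit in 14 bits")
        if not 0 <= length < (1 << 10):
            raise ValueError("block length must fit in 10 bits")
        word = (1 << 31) | ((block_pt & 0x7F) << 24) | (ts_offset << 10) | length
        header += struct.pack("!I", word)
    return header + bytes([primary_pt & 0x7F])     # F=0 marks the last header


# Field values from Figure 5: the event payload is the redundant block and the
# tone payload is the primary block.
event_payload = struct.pack("!BBH", 1, (1 << 7) | 20, 1760)
tone_payload = struct.pack("!IHH", (20 << 16) | 160, 697, 1209)
red_payload = (pack_red_headers([(100, 12800 - 11200, len(event_payload))], 101)
               + event_payload + tone_payload)
```

The timestamp offset of 1600 is the difference between the RTP timestamp of the packet (12800) and the timestamp at which the event began (11200). Because the headers occupy five octets, the payloads that follow are not aligned on 32-bit boundaries.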


6. Security Considerations

RTP packets using the payload formats defined in this specification are subject to the security considerations discussed in the RTP specification (RFC 3550 [5]), and any appropriate RTP profile (for example, RFC 3551 [13]). The RFC 3550 discussion focuses on requirements for confidentiality. Additional security considerations relating to implementation are described in RFC 2198 [2].


The telephone-event payload defined in this specification is highly compressed. A change in value of just one bit can result in a major change in meaning as decoded at the receiver. Thus, message integrity MUST be provided for the telephone-event payload type.


To meet the need for protection both of confidentiality and integrity, compliant implementations SHOULD implement the Secure Real-time Transport Protocol (SRTP) [7].


Note that the appropriate method of key distribution for SRTP may vary with the specific application.


In some deployments, it may be preferable to use other means to provide protection equivalent to that provided by SRTP.


Provided that gateway design includes robust, low-overhead tone generation, this payload type does not exhibit any significant non-uniformity in the receiver side computational complexity for packet processing to cause a potential denial-of-service threat.


7. IANA Considerations

This document updates the descriptions of two RTP payload formats, 'telephone-event' and 'tone', and associated Internet media types, audio/telephone-event and audio/tone. It also documents the event codes for DTMF tone events.


Within the audio/telephone-event type, events MUST be registered with IANA. Registrations are subject to the policies "Specification Required" and "Expert Review" as defined in RFC 2434 [3]. The IETF-appointed expert must ensure that:


a. the meaning and application of the proposed events are clearly documented;


b. the events cannot be represented by existing event codes, possibly with some minor modification of event definitions;


c. the number of events is the minimum necessary to fulfill the purpose of their application(s).


The expert is further responsible for providing guidance on the allocation of event codes to the proposed events. Specifically, the expert must indicate whether the event appears to be the same as one defined in RFC 2833 but not specified in any new document. In this case, the event code specified in RFC 2833 for that event SHOULD be assigned to the proposed event. Otherwise, event codes MUST be assigned from the set of available event codes listed below. If this set is exhausted, the criterion for assignment from the reserved set of event codes is to first assign those that appear to have the lowest probability of being revived in their RFC 2833 meaning in a new specification.


The documentation for each event MUST indicate whether the event is a state, tone, or other type of event (e.g., an out-of-band electrical event such as on-hook or an indication that will not itself be played out as tones at the receiving end). For tone events, the documentation MUST indicate whether the volume field is applicable or must be set to 0.


In view of the tradeoffs between the different reliability mechanisms discussed in Section 2.6, documentation of specific events SHOULD include a discussion of the appropriate design decisions for the applications of those events.


Legal event codes range from 0 to 255. The initial registry content is shown in Table 7, and consists of the sixteen events defined in Section 3 of this document. The remaining codes have the following disposition:


o codes 17-22, 50-51, 90-95, 113-120, 169, and 206-255 are available for assignment;


o codes 23-40, 49, and 52-63 are reserved for events defined in [16];


o codes 121-137 and 174-205 are reserved for events defined in [17];


o codes 16, 41-48, 64-88, 96-112, 138-168, and 170-173 are reserved in the first instance for specifications reviving the corresponding RFC 2833 events, and in the second instance for general assignment after all other codes have been assigned.


        | Event Code | Event Name                     | Reference |
        |          0 | DTMF digit "0"                 |  RFC 4733 |
        |          1 | DTMF digit "1"                 |  RFC 4733 |
        |          2 | DTMF digit "2"                 |  RFC 4733 |
        |          3 | DTMF digit "3"                 |  RFC 4733 |
        |          4 | DTMF digit "4"                 |  RFC 4733 |
        |          5 | DTMF digit "5"                 |  RFC 4733 |
        |          6 | DTMF digit "6"                 |  RFC 4733 |
        |          7 | DTMF digit "7"                 |  RFC 4733 |
        |          8 | DTMF digit "8"                 |  RFC 4733 |
        |          9 | DTMF digit "9"                 |  RFC 4733 |
        |         10 | DTMF digit "*"                 |  RFC 4733 |
        |         11 | DTMF digit "#"                 |  RFC 4733 |
        |         12 | DTMF digit "A"                 |  RFC 4733 |
        |         13 | DTMF digit "B"                 |  RFC 4733 |
        |         14 | DTMF digit "C"                 |  RFC 4733 |
        |         15 | DTMF digit "D"                 |  RFC 4733 |

Table 7: audio/telephone-event Event Code Registry
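
As a minimal sketch of how an application might use Table 7, the following Python fragment maps a keypad character to its event code and back. The names DTMF_CHARS, dtmf_event_code, and dtmf_digit are hypothetical; accepting lower-case "a" to "d" is an assumption made for convenience.

```python
# Illustrative sketch; the names below are not defined by this document.
# The position of each character in the string equals its event code in Table 7.
DTMF_CHARS = "0123456789*#ABCD"


def dtmf_event_code(digit):
    """Return the event code (0-15) for a single DTMF character."""
    if len(digit) != 1 or digit.upper() not in DTMF_CHARS:
        raise ValueError("not a DTMF digit: %r" % (digit,))
    return DTMF_CHARS.index(digit.upper())


def dtmf_digit(code):
    """Return the DTMF character for an event code in the range 0-15."""
    if not 0 <= code <= 15:
        raise ValueError("not a DTMF event code: %r" % (code,))
    return DTMF_CHARS[code]
```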


7.1. Media Type Registrations
7.1.1. Registration of Media Type audio/telephone-event

This registration is done in accordance with [6] and [8].


Type name: audio


Subtype name: telephone-event


Required parameters: none.


Optional parameters:


The "events" parameter lists the events supported by the implementation. Events are listed as one or more comma-separated elements. Each element can be either a single integer providing the value of an event code or an integer followed by a hyphen and a larger integer, representing a range of consecutive event code values. The list does not have to be sorted. No white space is allowed in the argument. The union of all of the individual event codes and event code ranges designates the complete set of event numbers supported by the implementation. If the "events" parameter is omitted, support for events 0-15 (the DTMF tones) is assumed.
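
The syntax of the "events" parameter described above can be sketched as a small parser. This is an illustration under stated assumptions, not a normative grammar: the function name parse_events is hypothetical, and rejecting values above 255 is inferred from the legal range of event codes rather than stated for this parameter.

```python
# Illustrative sketch; parse_events is a hypothetical helper name.
def parse_events(value=None):
    """Return the set of event codes named by an "events" parameter value.

    An omitted parameter (None) means support for events 0-15 is assumed.
    """
    if value is None:
        return set(range(16))
    codes = set()
    for element in value.split(","):
        first, hyphen, last = element.partition("-")
        # Plain ASCII digits only: this also rejects white space and
        # empty elements such as the ones in "1,,2".
        if not (first.isascii() and first.isdigit()):
            raise ValueError("bad element: %r" % (element,))
        low = high = int(first)
        if hyphen:
            if not (last.isascii() and last.isdigit()):
                raise ValueError("bad range: %r" % (element,))
            high = int(last)
            if high <= low:
                raise ValueError("range must end in a larger integer")
        if high > 255:
            raise ValueError("event codes range from 0 to 255")
        codes.update(range(low, high + 1))
    return codes
```

Because the result is a set, overlapping elements and unsorted lists need no special handling; the union is formed as elements are read.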

The "rate" parameter describes the sampling rate, in Hertz. The number is written as an integer. If omitted, the default value is 8000 Hz.

Encoding considerations:


In the terminology defined by [8] section 4.8, this type is framed and binary.


Security considerations:


See Section 6, "Security Considerations", in this document.


Interoperability considerations: none.


Published specification: this document.


Applications which use this media:


The telephone-event audio subtype supports the transport of events occurring in telephone systems over the Internet.


Additional information:


Magic number(s): N/A.
File extension(s): N/A.
Macintosh file type code(s): N/A.

Person & email address to contact for further information:


Tom Taylor, IETF AVT Working Group.

Intended usage: COMMON.


Restrictions on usage:


This type is defined only for transfer via RTP [5].

Author: IETF Audio/Video Transport Working Group.


Change controller:


IETF Audio/Video Transport Working Group as delegated from the IESG.


7.1.2. Registration of Media Type audio/tone

This registration is done in accordance with [6] and [8].


Type name: audio


Subtype name: tone


Required parameters: none


Optional parameters:


The "rate" parameter describes the sampling rate, in Hertz. The number is written as an integer. If omitted, the default value is 8000 Hz.

Encoding considerations:


In the terminology defined by [8] section 4.8, this type is framed and binary.


Security considerations:


See Section 6, "Security Considerations", in this document.


Interoperability considerations: none


Published specification: this document.


Applications which use this media:


The tone audio subtype supports the transport of pure composite tones, for example, those commonly used in the current telephone system to signal call progress.


Additional information:


Magic number(s): N/A.
File extension(s): N/A.
Macintosh file type code(s): N/A.

Person & email address to contact for further information:


Tom Taylor, IETF AVT Working Group.

Intended usage: COMMON.


Restrictions on usage:


This type is defined only for transfer via RTP [5].

Author: IETF Audio/Video Transport Working Group.


Change controller:


IETF Audio/Video Transport Working Group as delegated from the IESG.


8. Acknowledgements

Scott Petrack was the original author of RFC 2833. Henning Schulzrinne later loaned his expertise to complete the document, but Scott must be credited with the energy behind the idea of a compact encoding of tones over IP.

In RFC 2833, the suggestions of the Megaco working group were acknowledged. Colin Perkins and Magnus Westerlund, Chairs of the AVT Working Group, provided helpful advice in the formation of the present document. Over the years, detailed advice and comments for RFC 2833, this document, or both were provided by Hisham Abdelhamid, Flemming Andreasen, Fred Burg, Steve Casner, Dan Deliberato, Fatih Erdin, Bill Foster, Mike Fox, Mehryar Garakani, Gunnar Hellstrom, Rajesh Kumar, Terry Lyons, Steve Magnell, Zarko Markov, Tim Melanchuk, Kai Miao, Satish Mundra, Kevin Noll, Vern Paxson, Oren Peleg, Raghavendra Prabhu, Moshe Samoha, Todd Sherer, Adrian Soncodi, Yaakov Stein, Mira Stevanovic, Alex Urquizo, and Herb Wildfeur.

9. References
9.1. Normative References

[1] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[2] Perkins, C., Kouvelas, I., Hodson, O., Hardman, V., Handley, M., Bolot, J., Vega-Garcia, A., and S. Fosse-Parisis, "RTP Payload for Redundant Audio Data", RFC 2198, September 1997.

[3] Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA Considerations Section in RFCs", BCP 26, RFC 2434, October 1998.

[4] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with Session Description Protocol (SDP)", RFC 3264, June 2002.

[5] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", STD 64, RFC 3550, July 2003.

[6] Casner, S. and P. Hoschka, "MIME Type Registration of RTP Payload Formats", RFC 3555, July 2003.

[7] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K. Norrman, "The Secure Real-time Transport Protocol (SRTP)", RFC 3711, March 2004.

[8] Freed, N. and J. Klensin, "Media Type Specifications and Registration Procedures", BCP 13, RFC 4288, December 2005.

[9] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session Description Protocol", RFC 4566, July 2006.

[10] International Telecommunication Union, "Technical features of push-button telephone sets", ITU-T Recommendation Q.23, November 1988.

[11] International Telecommunication Union, "Multifrequency push-button signal reception", ITU-T Recommendation Q.24, November 1988.

9.2. Informative References

[12] Schulzrinne, H. and S. Petrack, "RTP Payload for DTMF Digits, Telephony Tones and Telephony Signals", RFC 2833, May 2000.

[13] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and Video Conferences with Minimal Control", STD 65, RFC 3551, July 2003.

[14] Kreuter, R., "RTP Payload Format for a 64 kbit/s Transparent Call", RFC 4040, April 2005.

[15] Hellstrom, G. and P. Jones, "RTP Payload for Text Conversation", RFC 4103, June 2005.

[16] Schulzrinne, H. and T. Taylor, "Definition of Events for Modem, Fax, and Text Telephony Signals", RFC 4734, December 2006.

[17] Schulzrinne, H. and T. Taylor, "Definition of Events For Channel-Oriented Telephony Signalling", Work in Progress, November 2005.

[18] International Telecommunication Union, "Technical characteristics of tones for the telephone service", ITU-T Recommendation E.180/Q.35, March 1998.

[19] International Telecommunication Union, "Pulse code modulation (PCM) of voice frequencies", ITU-T Recommendation G.711, November 1988.

[20] International Telecommunication Union, "Speech coders: Dual rate speech coder for multimedia communications transmitting at 5.3 and 6.3 kbit/s", ITU-T Recommendation G.723.1, March 1996.


[21] International Telecommunication Union, "Coding of speech at 8 kbit/s using conjugate-structure algebraic-code-excited linear-prediction (CS-ACELP)", ITU-T Recommendation G.729, March 1996.

[22] International Telecommunication Union, "ISDN user-network interface layer 3 specification for basic call control", ITU-T Recommendation Q.931, May 1998.

[23] International Telecommunication Union, "Procedures for real-time Group 3 facsimile communication over IP networks", ITU-T Recommendation T.38, July 2003.

[24] International Telecommunication Union, "Procedures for starting sessions of data transmission over the public switched telephone network", ITU-T Recommendation V.8, November 2000.

[25] International Telecommunication Union, "Modem-over-IP networks: Procedures for the end-to-end connection of V-series DCEs", ITU-T Recommendation V.150.1, January 2003.

[26] International Telecommunication Union, "Procedures for supporting Voice-Band Data over IP Networks", ITU-T Recommendation V.152, January 2005.

[27] International Telecommunication Union, "Operational and interworking requirements for DCEs operating in the text telephone mode", ITU-T Recommendation V.18, November 2000. See also Recommendation V.18 Amendment 1, Nov. 2002.

[28] VOIP Troubleshooter LLC, "Indepth: Packet Loss Burstiness", 2005, <>.

Appendix A. Summary of Changes from RFC 2833


The memo has been significantly restructured, incorporating a large number of clarifications to the specification. With the exception of those items noted below, the changes to the memo are intended to be backwards-compatible clarifications. However, due to inconsistencies and unclear definitions in RFC 2833 [12] it is likely that some implementations interpreted that memo in ways that differ from this version.

RFC 2833 required that all implementations be capable of receiving the DTMF events (event codes 0-15). Section 2.5.1.1 of the present document requires that a sender transmit only the events that the receiver is capable of receiving. In the absence of knowledge of receiver capabilities, the sender SHOULD assume support of the DTMF events but of no other events. The sender SHOULD indicate what events it can send. Section 2.5.2.1 requires that a receiver signalling its capabilities using SDP indicate which events it can receive.

Non-zero values in the volume field of the payload were applicable only to DTMF tones in RFC 2833, and for other events the receiver was required to ignore them. The present memo requires that the definition of each event indicate whether the volume field is applicable to that event. The last paragraph of Section 2.5.2.2 indicates what a receiver may do if it receives volumes with zero values for events to which the volume field is applicable. Along with the RFC 2833 receiver rule, this ensures backward compatibility in both directions of transmission.

Section 2.5.1.3 and Section 2.5.2.3 introduce a new procedure for reporting and playing out events whose duration exceeds the capacity of the payload duration field. This procedure may cause momentary confusion at an old (RFC 2833) receiver, because the timestamp is updated without setting the E bit of the preceding event report and without setting the M bit of the new one.
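
The arithmetic behind the segmentation of long events can be sketched as follows. This is a simplified illustration, not the sender procedure itself: it assumes the 16-bit duration field of the event payload (maximum value 0xFFFF timestamp units) and a 32-bit RTP timestamp, and it computes only the segment boundaries, not the interim reports sent within each segment. The name segment_long_event is hypothetical.

```python
# Illustrative sketch; segment_long_event is a hypothetical helper name.
MAX_DURATION = 0xFFFF        # largest value the 16-bit duration field can hold
TIMESTAMP_MASK = 0xFFFFFFFF  # RTP timestamps are 32 bits and wrap around


def segment_long_event(start_timestamp, total_duration):
    """Split an event into (timestamp, duration) segments.

    Each segment except the last reports the maximum duration; the next
    segment starts at the timestamp where the previous one ended.
    """
    if total_duration < 0:
        raise ValueError("duration must not be negative")
    segments = []
    timestamp = start_timestamp & TIMESTAMP_MASK
    remaining = total_duration
    while remaining > MAX_DURATION:
        segments.append((timestamp, MAX_DURATION))
        timestamp = (timestamp + MAX_DURATION) & TIMESTAMP_MASK
        remaining -= MAX_DURATION
    segments.append((timestamp, remaining))
    return segments
```

At the default 8000 Hz rate, one segment covers at most about 8.19 seconds, so a 20-second event (160000 timestamp units) would occupy three segments. The change of timestamp between segments, with neither the E bit nor the M bit set, is what an RFC 2833 receiver may find confusing.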

Section 2.5.1.5 and Section 2.5.2.4 introduce a new procedure whereby a sequence of short-duration events may be packed into a single event report. If an old (RFC 2833) receiver receives such a report, it may discard the packet as invalid, since the packet holds more content than the receiver was expecting. In any event, the additional events in the packet will be lost.
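
To make the size difference concrete, the following Python sketch packs several event reports back to back. It assumes the four-octet event payload layout (8-bit event code; E bit, R bit, and 6-bit volume; 16-bit duration in network byte order). The helper names and the tuple layout are hypothetical, and the choice of E bit for each report is left to the caller.

```python
import struct

# Illustrative sketch; helper names and tuple layout are assumptions.
EVENT_FORMAT = "!BBH"   # event code, E/R/volume octet, duration (big-endian)
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)   # 4 octets


def pack_event(event, duration, volume=0, end=False):
    """Encode one event report as four octets (R bit left at zero)."""
    if not (0 <= event <= 255 and 0 <= volume <= 63 and 0 <= duration <= 0xFFFF):
        raise ValueError("field out of range")
    flags = (0x80 if end else 0x00) | volume
    return struct.pack(EVENT_FORMAT, event, flags, duration)


def pack_events(reports):
    """Concatenate (event, duration, volume, end) reports into one payload."""
    return b"".join(pack_event(e, d, v, end) for e, d, v, end in reports)


def unpack_events(payload):
    """Decode a payload into a list of (event, duration, volume, end) tuples."""
    if not payload or len(payload) % EVENT_SIZE:
        raise ValueError("payload is not a whole number of event reports")
    return [(event, duration, flags & 0x3F, bool(flags & 0x80))
            for event, flags, duration in struct.iter_unpack(EVENT_FORMAT, payload)]
```

A payload built from two reports is eight octets long. A receiver written for exactly one four-octet report would either reject it or read only the first report, which matches the loss described above.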

Section 2.3.5 introduces the possibility of "state" events and defines procedures for setting the duration field for reports of such events. Section defines special exemptions from the setting of the E bit for state events. Three more sections mention procedures related to these events.


The Security Considerations section is updated to mention the requirement for protection of integrity. More importantly, it makes implementation of SRTP [7] mandatory for compliant implementations, without specifying a mandatory-to-implement method of key distribution.

Finally, this document establishes an IANA registry for event codes and establishes criteria for their documentation. This document provides an initial population for the new registry, consisting solely of the sixteen DTMF events. Two companion documents [16] and [17] describe events related to modems, fax, and text telephony and to channel-associated telephony signalling, respectively. Some changes were made to the latter because of errors and redundancies in the RFC 2833 assignments. The remaining events defined in RFC 2833 are deprecated because they do not appear to have been implemented, but their codes have been conditionally reserved in case any of them is needed in the future. Table 8 indicates the disposition of the event codes in detail. Event codes not mentioned in this table were not allocated by RFC 2833 and continue to be unused.

   | Event Codes | RFC 2833 Description                  | Disposition |
   |        0-15 | DTMF digits                           | RFC 4733    |
   |          16 | Line flash (deprecated)               | Reserved    |
   |       23-31 | Unused                                | [16]        |
   |       32-40 | Data and fax                          | [16]        |
   |       41-48 | Data and fax (V.8bis, deprecated)     | Reserved    |
   |       52-63 | Unused                                | [16]        |
   |       64-89 | E.182 line events (deprecated)        | Reserved    |
   |      96-112 | Country-specific line events          | Reserved    |
   |             | (deprecated)                          |             |
   |     121-127 | Unused                                | [17]        |
   |     128-137 | Trunks: MF 0-9                        | [17]        |
   |     138-143 | Trunks: other MF (deprecated)         | Reserved    |
   |     144-159 | Trunks: ABCD signalling               | [17]        |
   |     160-168 | Trunks: various (deprecated)          | Reserved    |
   |     170-173 | Trunks: various (deprecated)          | Reserved    |
   |     174-205 | Unused                                | [17]        |

Table 8: Disposition of RFC 2833-defined Event Codes

Authors' Addresses


Henning Schulzrinne
Columbia U.
Dept. of Computer Science
Columbia University
1214 Amsterdam Avenue
New York, NY 10027
US



Tom Taylor
Nortel
1852 Lorraine Ave
Ottawa, Ontario K1H 6Z8
Canada



Full Copyright Statement


Copyright (C) The IETF Trust (2006).


This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.



Intellectual Property


The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at


The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.



Funding for the RFC Editor function is currently provided by the Internet Society.
