3. The Well-Behaved User Agent

The term "user agent" comprises posting agents, reading agents and followup agents as defined in [USEFOR], and also reply agents, by which is meant a user agent that is generating an email, presumably addressed to the poster of an article. Although it is usual for all these functionalities to be included within a single piece of software, it is convenient to discuss them separately here.

This section is addressed primarily to the implementors of user agents. Whilst it is common for such agents to combine the functions of a Netnews User Agent (NUA) and a Mail User Agent (MUA), it needs to be realized that they serve different functions, and adding a few extra features to an MUA is unlikely to result in a good NUA, any more than adding a few extra features to an NUA would result in a good MUA.

3.1. The Well-Behaved Posting Agent

The implementor of a posting agent SHOULD make it possible for a suitably persevering poster to generate any article, however absurd, that conforms strictly to [USEFOR]. On the other hand, it needs to be understood that the difference between a good posting agent and a bad posting agent lies in its ability to encourage the poster to adhere to good standards of "netkeeping", by making it easy to generate articles that accord with the conventions and expectations of the Usenet community, and hard to generate articles outside of those norms. This is largely a matter of choosing appropriate defaults for various parameters and settings.

Here it should be noted that what is acceptable in Email (which is a one-to-few communication where the author can be expected to be aware of the capabilities and preferences of his correspondents) may not be acceptable in Netnews (which is a one-to-many communication directed at an unseen and unknown audience).
Much grief has arisen in the past from poorly designed agents which tried to impose onto Usenet defaults and practices which were perfectly appropriate for Email.

3.1.1. Construction of Headers

Whilst it SHOULD be possible to insert any legitimate header, not limited to those defined in [USEFOR] and including experimental headers, there are certain essential headers, namely the Subject-, Newsgroups-, Followup-To- and Reply-To-headers which the poster MUST be able to insert and/or edit (and to do so at any stage during the composition of the article). Note that this specifically includes the possibility of setting the followup to "poster".

NOTE: This does not mean that headers must be presented for editing in the exact form specified in [USEFOR]; a graphical interface for editing the various header contents would suffice, although it would be useful if that included the facility for the poster to determine the folding.

Posting agents SHOULD permit the poster to include headers of arbitrary length (and MUST permit at least 79 characters). However, they SHOULD endeavour to keep individual header lines, so far as is possible, within 79 characters (or other established policy limit) by folding them at suitable places (however, the limit of 998 octets ([USEFOR] 4.5) on any individual header line still applies); but if the poster has manually folded a header within the accepted limits (to achieve some pleasing layout, for example) the posting agent SHOULD respect the poster's intent.

Although header-contents are defined in such a way that folding ([USEFOR] 4.2.3) can take place between many of the lexical tokens (and even within some of them), folding SHOULD be limited to placing the CRLF at higher-level syntactic breaks, and SHOULD also avoid leaving trailing WSP on the preceding line. For instance, if a header-content is defined as comma-separated values, it is RECOMMENDED that folding occur after the comma separating the values, even if it is allowed elsewhere.
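The folding recommendation above can be illustrated with a minimal sketch. It is not taken from [USEFOR]; the function name fold_comma_separated, the choice of a Keywords header and the default limit of 79 are assumptions made purely for illustration. The sketch folds only after a comma, never leaves trailing WSP on the preceding line, and starts each continuation line with a single SP.

```python
def fold_comma_separated(name, values, limit=79):
    """Fold 'Name: v1, v2, ...' after commas so that lines stay within limit.

    Hypothetical helper for illustration only. A single value that is
    itself longer than the limit is left unfolded on its own line.
    """
    start = name + ":"
    lines = []
    current = start
    for index, value in enumerate(values):
        # Every value except the last carries its separating comma with it.
        piece = value + ("," if index < len(values) - 1 else "")
        # The "+ 1" accounts for the SP placed in front of the piece.
        if current != start and len(current) + 1 + len(piece) > limit:
            lines.append(current)   # ends with a comma, no trailing WSP
            current = " " + piece   # continuation line begins with one SP
        else:
            current += " " + piece
    lines.append(current)
    return "\r\n".join(lines)


folded = fold_comma_separated("Keywords", ["alpha", "beta", "gamma"], limit=20)
print(folded)
```

Removing each CRLF from the result restores the unfolded header exactly, which is the property a relaying or reading agent relies upon when unfolding.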
There is a preferred case convention, which posters and posting agents SHOULD use: each hyphen-separated "word" has its initial letter (if any) in uppercase and the rest in lowercase, except that some abbreviations have all letters uppercase (e.g. "Message-ID" and "MIME-Version"). The forms given in the various rules defining headers in [USEFOR] show the preferred forms (but relaying and reading agents are expected to tolerate articles not obeying this convention).

A comment ([USEFOR] 4.2.4) is normally used to provide some human readable informational text, except at the end of a mailbox which contains no phrase, as in

   fred@foo.bar.example (Fred Bloggs)

as opposed to

   "Fred Bloggs" <fred@foo.bar.example>

The former is a deprecated, but commonly encountered, usage for indicating the name of the person whose mailbox it is. Posting agents SHOULD NOT now be generating it.

Headers that merely state defaults explicitly (e.g., a Followup-To-header with the same content as the Newsgroups-header, or a MIME Content-Type-header with contents "text/plain; charset=us-ascii"), or state information that reading agents can typically determine easily themselves (e.g. the length of the body in octets) are redundant and posting agents SHOULD NOT include them.

There follow some recommendations specific to particular headers.

3.1.1.1. Date

It is RECOMMENDED to add a comment, after the date-time, containing the time zone in human-readable form. However, many of the abbreviations commonly used for this purpose are ambiguous, and so the value given by the <zone> is the only definitive form. For example:

   Date: Sat, 26 May 2001 11:13:00 -0500 (EST)

3.1.1.2. From

The mailboxes in the From-content MUST contain syntactically valid email addresses identifying the poster(s). Each such mailbox SHOULD be a working email address, belonging to the poster(s) of the article, or the person or agent on whose behalf the article is posted.
When, for whatever reason, a poster does not wish to use a working address, the mailbox concerned SHOULD, to comply with [USEFOR], end in the top level domain ".invalid" [RFC 2606].

NOTE: It is fashionable for posters to disguise their mail addresses to discourage malicious harvesting and for other purposes. Whilst the circumstances which might make this seem desirable are much to be regretted, the practice cannot be regarded as in the best interests of Usenet, and this document does not seek to promote the practice, even though it shows how to do it "correctly". Therefore, it is NOT recommended that implementors should go out of their way to facilitate it.

3.1.1.3. Message-ID

Posting agents have the option of generating their own message identifiers, or of leaving it to the injecting agent. Recall that it is an absolute requirement of [USEFOR] that message identifiers should be unique with regard to all other Netnews articles or Email messages, past, present or future. However, it would in practice be sufficient to ensure that there were astronomical odds against a duplicated message identifier, and this is usually brought about by using the domain name of the originating site in the id-right of the msg-id, together with the time of composition and other disambiguating material (such as a process number or a serial number) in the id-left. It is also in order to include additional information of significance to the poster within the id-left, and even to deliberately make a non-unique identifier in cases where the identical message is to be posted by several posters (for example, a cancel for an article which may also be cancelled by others).

[Recall that we have two drafts regarding the construction of message identifiers on www.landfield.com/usefor that were written in the early days of Usefor. Maybe these should be dusted down, published, and referred to here.]

3.1.1.4.
Subject

There is a temptation amongst inventors of new protocols to require particular phrases to be inserted or recognized automatically at particular places within the Subject-header. This temptation is strongly to be resisted. There are, however, two exceptions to this principle which have become hallowed by longstanding usage:

1. There is an established convention for the Subject-header in a followup to begin with "Re: ", and this SHOULD be supported (see 3.2.1.1).

2. For compatibility with legacy news software, the Subject-content of a control message (i.e. an article that also contains a Control-header) MAY start with the string "cmsg ", and non-control messages SHOULD NOT start with the string "cmsg ". See also section 6.1.

[SHOULD NOT changed from MUST NOT? Do there really still exist servers or other agents that will recognize and act upon "cmsg" in a Subject-header? And if so, maybe that MUST NOT should be moved back into [USEFOR].]

Subject-headers are for humans to read, and the most that user agents should do is to filter them as directed by their human readers. If some enhancement to Netnews requires support within the headers, then the proper procedure is to invent a new header for the purpose, or to adapt an existing header (supposing it had the capability to support such adaptations).

3.1.1.5. Newsgroups

There are restrictions on the length of components of newsgroup-names, and on the newsgroup-names themselves, as described more fully in 7.2. Posting and injecting agents MAY attempt to enforce them but, because of the possibility that hierarchy policies or future standards may relax them, it SHOULD be possible for posters to override such checks, and software MUST be so written that they can be disabled altogether.

Posting agents MAY (and followup agents SHOULD) accept articles crossposted to newsgroups which do not exist on their local hosts, though posting agents SHOULD at least alert the poster to the situation and request confirmation.
3.1.1.6. Reply-To

In the absence of Reply-To, the reply address(es) is the address(es) in the From-header. For this reason a Reply-To SHOULD NOT be included if it just duplicates the From-header.

NOTE: Use of a Reply-To-header is preferable to including a similar request in the article body, because replying agents can take account of Reply-To automatically.

3.1.1.7. Organization

Posting agents are discouraged from providing a default value for this header unless it is acceptable to all posters using those agents and unless it contains useful information (including some indication of the poster's physical environment). See section 4.1.2 for an even stronger discouragement for injecting agents.

3.1.1.8. Distribution

Posting agents SHOULD NOT provide a default Distribution-header without giving the poster an opportunity to override it.

3.1.1.9. Followup-To

A Followup-To-header SHOULD NOT be included if it just duplicates the Newsgroups-header. At least one of its newsgroup-names SHOULD exist on the posting agent's host (since a well behaved poster ought not to be setting followups to a place that he cannot read). Cf. a similar rule regarding crossposting in [USEFOR] section 5.5.

3.1.1.10. User-Agent

Comments in User-Agent-headers should be restricted to information regarding the product named to their left, such as its full name or platform information, and should be concise. Use as an advertising medium (in the mundane sense) is discouraged.

3.1.2. Construction of Bodies

It was the fashion at one time to indicate underlining within body texts using Backspace, in the form of an underscore (US-ASCII 95), a backspace, and a character, repeated for each character that should be underlined. Posting agents MAY support this mechanism, although it is no longer so common for reading agents to process it.

NOTE: using this precise method should ensure that reading agents that cannot display the text underlined will at least display it correctly in an un-underlined form.
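As an illustration of the backspace mechanism just described, the following minimal sketch encodes a string as underscore, backspace, character for each character, and shows how a reading agent unable to render underlining might recover the plain text. The function names underline and strip_underlining are hypothetical and not drawn from any standard.

```python
BACKSPACE = "\x08"  # US-ASCII 8


def underline(text):
    """Encode text as underscore, backspace, character for each character."""
    return "".join("_" + BACKSPACE + ch for ch in text)


def strip_underlining(text):
    """Recover the un-underlined form by dropping each underscore-backspace pair."""
    return text.replace("_" + BACKSPACE, "")


encoded = underline("Hello_world")
plain = strip_underlining(encoded)
```

Because the underscore always comes first in each triple, a display that simply honours the backspace overprints the underscore with the intended character, which is why this precise ordering degrades gracefully.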
The formfeed character (US-ASCII 12) (which is sometimes referred to as the "spoiler character") MAY be used (see 3.3.3 for its effect on reading agents).

In plain-text articles (those with no MIME headers, or those with a MIME Content-Type of "text/plain") posting agents SHOULD endeavour to keep the length of body lines within some reasonable limit. The size of this limit is a matter of policy, the default being to keep within 79 characters at most, and preferably within 72 characters (to allow room for quoting in followups). Except where "format=flowed" is being used (3.1.2.2), the line breaks shown to the poster during editing SHOULD be exactly as they will appear in the posted article.

NOTE: That policy limit (e.g. 72 or 79) should be expressed as a number of characters (as they will be displayed by a reading agent) rather than as the number of octets used to encode them.

For use on occasions where established policy prescribes different line lengths (this usually arises in groups where the charset for the language used is best represented using double width characters) the preferred line length SHOULD be a configurable option. Posting agents MUST permit the poster to create individual lines longer than the default or configured length if he so insists (which may require the cessation of any automatic generation of flowed lines [RFC 3676] on a temporary basis), but it SHOULD be made apparent to the poster (e.g. by issuing a warning) that the article contains lines longer than the customary length.

If the software uses an external editor, the editor called by default SHOULD be able to meet all the requirements of this section.

3.1.2.1. Signatures

A "personal signature" is a short closing text automatically added to the end of articles by posting agents, identifying the poster and giving his network addresses, etc.
Whenever a poster or posting agent appends such a signature to an article, it MUST be preceded by a delimiter line containing (only) two hyphens (US-ASCII 45) followed by one SP (US-ASCII 32). The signature is considered to extend from the last occurrence of that delimiter up to the end of the article (or up to the end of the part in the case of a multipart MIME body). Posting agents SHOULD provide a facility to enable the poster to add such signatures, and SHOULD discourage (at least with a warning) signatures of excessive length (4 lines is a commonly accepted limit).

3.1.2.2. Usage of MIME

When the Content-Type is "text/plain", the recommendations and limits on line lengths set out above SHOULD be observed.

Posting agents MAY use the "format=flowed" parameter of "text/plain" (and also the "DelSp=yes" if appropriate) defined in [RFC 3676] so as to allow suitably equipped reading agents to reformat flowed paragraphs to suit the width of their display areas. However, it must be understood that many reading agents do not support that feature, and therefore the physical length of all lines SHOULD be restricted to the default preferred length of 72 characters, rather than the 78 recommended in [RFC 3676]. However, single words longer than that length (and this specifically applies to URIs [RFC 2396]) MUST NEVER be split across more than one physical line.

Other forms of text, such as "text/html", SHOULD NOT be used except in groups where established policy or custom so allows (7.3). However, where they are so used then, for the benefit of readers who see it only in its transmitted form, the material SHOULD be "pretty-printed" (for example by restricting its line length as above and by keeping sequences which control its layout or style separate from the meaningful text).
Likewise, Content-Types requiring special processing for their display, notably the "binary" Content-Types "image", "audio" and "video" (including also material encoded by the "uuencode" protocol), together with most "application" types, SHOULD NOT be used except in groups where established policy or custom so allows (7.3). Exceptionally, those application types defined in [RFC 1847] and [RFC 3156] for use within "multipart/signed" articles, and the type "application/pgp-keys" (or other similar types containing digital certificates) may be used freely.

The Content-Type "message/partial" is not recommended for textual articles because the Content-Type, and in particular the charset, of the complete article cannot be determined by examination of the second and subsequent parts, and hence (except when they are written in pure US-ASCII) it is not possible to read them as separate articles (as by a reader who wanted to "browse ahead" to see whether it was worth his while to read the whole set). Moreover, for full compliance with [RFC 2046] it would be necessary to use the "quoted-printable" encoding to ensure the material was 7bit-safe. In any case, breaking such long texts into several parts is usually unnecessary, since modern transport agents should have no difficulty in handling articles of arbitrary length. On the other hand, "message/partial" may be useful for binaries of excessive length, since reading of the individual parts on their own is not required and they would likely already be encoded in a manner that was 7bit-safe.

The Content-Type "message/rfc822" SHOULD be used where complete news articles or email messages are to be included within another article ([USEFOR] 6.21.2).

The Content-Type "message/external-body" could be appropriate for texts which it would be uneconomic (in view of the likely readership) to distribute to the entire network.
The Content-Types "multipart/mixed", "multipart/parallel" and "multipart/signed" may be used freely in news articles. However, except where policy or custom so allows, the Content-Type: "multipart/alternative" SHOULD NOT be used, on account of the extra bandwidth consumed and the difficulty of quoting in followups.

The Content-Type: "multipart/digest" is commended for any article composed of multiple messages more conveniently viewed as separate entities, thus enabling reading agents to move rapidly between them. The "boundary" should be composed of 28 hyphens (US-ASCII 45) (which makes each boundary delimiter 30 hyphens, or 32 for the final one) so as to enable reading agents which currently support the digest usage described in [RFC 1153] to continue to operate correctly.

NOTE: The various recommendations given above regarding the usage of particular Content-Types apply also within the individual parts of these multiparts.

A multipart is preceded and followed by some spare text (a preamble before the first boundary and an epilogue after the last one). It is clear from [RFC 2046] that these texts are not to be considered part of the official message and SHOULD NOT be displayed by reading agents. It is useful for the preamble to contain words such as "This is a multipart message in MIME format" for the benefit of older reading agents that do not support MIME, but the epilogue SHOULD be empty and, in particular, it SHOULD NOT be used to hold the signature (3.1.2.1), as is sometimes done.

3.1.2.3. Content-Transfer-Encoding

The normal expectation ([USEFOR] 6.21.3) is that the Content-Transfer-Encoding will be "8bit" (or maybe "7bit" if the charset allows it). Other Content-Transfer-Encodings SHOULD NOT be used unless there are pressing reasons to do so.

The following are examples of such situations where a Content-Transfer-Encoding of other than "8bit" may be necessary.

1. The content type implies that the content is (or may be) "8bit-unsafe"; i.e.
it may contain octets equivalent to the US-ASCII characters CR or LF (other than in the combination CRLF) or NUL. In that case one of the Content-Transfer-Encodings "base64" or "quoted-printable" MUST be used, and reading agents MUST be able to handle both of them.

NOTE: If a future extension to the MIME standards were to provide a more compact encoding of binary suited to transport over an 8bit channel, it could be considered as an alternative to base64 once it had gained widespread acceptance.

2. It is often the case that "application" Content-Types are textual in nature, and intelligible to humans as well as to machines, and where this state can be recognized by the posting agent (either through knowledge of the particular application type or by testing) the material SHOULD NOT be treated as 8bit-unsafe; this has the added benefit, where the posting agent uses other than CRLF for line endings internally, of automatically ensuring that line endings are processed correctly during transport. If, on the other hand, the posting agent recognizes that the material is not textual, or cannot reasonably determine it to be so, then the material MUST be encoded as for 8bit-unsafe (however, in that case, it is the responsibility of the agent generating the material to ensure that line endings, if any, are represented correctly).

NOTE: All the application types defined by [USEFOR], namely "application/news-transmission", "application/news-groupinfo" and "application/news-checkgroups" are textual, and indeed designed for human reading.

3. Although the "text" Content-Types should normally be encoded as 8bit (or 7bit), if the character set specified by the "charset=" parameter can include the 3 disallowed octets, then the material MUST be encoded as for 8bit-unsafe. This is most likely to arise in the case of 16-bit character sets such as UTF-16 ([UNICODE 3.2] or [ISO/IEC 10646]).
In addition, where it is known that the material is subsequently to be gatewayed from Netnews to Email ([USEFOR] 8.8), the encoding "quoted-printable" MAY be used (otherwise the gateway might have to re-encode it itself).

4. Some protocols REQUIRE the use of a particular Content-Transfer-Encoding. In particular, the authentication protocol based on OpenPGP defined in [RFC 3156] mandates the use of one of the encodings "quoted-printable" or "base64". Whilst posters might be tempted to risk the use of "8bit" or "7bit" encodings (and indeed the referenced standard recommends that signed messages using those encodings be accepted and interpreted), they should be warned that differences in the treatment of trailing whitespace between OpenPGP [RFC 2440] and earlier versions of PGP may render signatures written with the one unverifiable by the other; and, moreover, Usenet articles are very likely to include trailing whitespace in the form of a personal signature (3.1.2.1).

5. The Content-Type message/partial [RFC 2046] is required to use encoding "7bit" (the encapsulated complete message may itself use encoding "quoted-printable" or "base64", but that information is only conveyed along with the first of the partial parts).

NOTE: Although there would actually be no problem using encoding "8bit" in a pure Netnews (as opposed to Email) environment, this document discourages the use of "message/partial" except for binary material, which will likely be encoded to pass through "7bit" in any case.

It may be necessary to change the Content-Transfer-Encoding at gateways. For example, in the case where an encapsulated news article with the Content-Type "message/rfc822" is to be transported by email and it has Content-Transfer-Encoding "8bit", the Content-Transfer-Encoding may need to be changed, although there may well be no problems in practice if the email transport supports 8BITMIME [RFC 2821].

3.2.
The Well-Behaved Followup Agent

Usenet is primarily a medium for discussion. The majority of articles that are posted are in fact followups to previous articles, and exceedingly complex threads can develop. Therefore, it is essential that user agents provide facilities for followups that will enable such elongated discussions to proceed smoothly.

3.2.1. Construction of Headers

The requirements on inserting and editing headers already set out in 3.1.1 still apply, and apply in particular to those headers for which the followup agent has set default values.

3.2.1.1. Subject

The Subject of the followup is, by default, taken from that of the precursor, but users are able to override that default; indeed they are to be encouraged to do so whenever appropriate in order to avoid long threads which have wandered far from the topic with which they originated, but which still adhere to the original Subject.

It has been a long-standing practice, both on Usenet and in Email, to prepend the back-reference "Re: " ([USEFOR] 5.4) to the Subject when preparing a followup, as an indication to the reader that this is a continuation of discussion of an earlier topic rather than the start of a new one. [USEFOR] does not require this practice, but permits it so long as it is not applied if such a back-reference is already present, and provided no string other than "Re: " is used for the purpose. However, the practice is not without its difficulties:

1. Although the "Re" (which is an abbreviation for the Latin "In re", meaning "in the matter of", and not an abbreviation of "Reference" as is sometimes erroneously supposed) may be understood by English speakers, and indeed by speakers of most European languages, its use in a newsgroup where articles were customarily written in Arabic, or Hindi, or Chinese would be less than helpful.

2.
It requires extra processing (to ignore it) in some reading agents which choose to consult the Subject-header when deciding the best order in which to present articles to the reader (see 3.3.2.1). This burden has to be weighed against the relatively small benefit of the indication provided directly to readers.

3. Sometimes, followup agents attempt to use translations of "Re: " into other languages, as in "Sv: " and "Antwort: ". But it is not practicable for those reading agents which take some special note of "Re: " also to take note of translations into an indeterminate number of other languages, and for this reason [USEFOR] makes it clear that such translations SHOULD NOT be used.

4. Even the presence of "Re: " at the start of a Subject may occasionally be misleading, because it might have been deliberately placed there by a poster rather than having been generated automatically by a followup agent.

5. And finally, there are philosophical arguments against features within an unstructured header which imply specific recognition and support within user agents (for reasons already explained in 3.1.1.4). Indeed, the only reason why [USEFOR] permits this particular exception is on account of its current widespread usage.

For these reasons, this document does not seek to perpetuate this practice, and indeed it might be better if its use were eventually to be phased out. Nevertheless, it is certain that it will continue to happen for some considerable period of time in newsgroups where English is the primary language, simply on account of the inertia already behind it. For this reason, section 3.3.2.1 RECOMMENDS stripping away any initial "Re: " when comparing Subjects.

It would be wiser for any followup agents which can detect apparent non-standard back-references such as "Re(2): ", "Sv: ", etc. to refrain from prepending anything further, but other attempts to mend that problem are likely to do more harm than good.
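The handling of the back-reference described above might be sketched as follows. This is an illustration only, not an algorithm given by [USEFOR]: the function names are hypothetical, and the NON_STANDARD pattern covers just the three forms named in the text ("Re(2): ", "Sv: " and "Antwort: "), so a real agent would need its own, necessarily incomplete, list.

```python
import re

BACK_REFERENCE = "Re: "

# Assumption for illustration: only the non-standard forms mentioned in the text.
NON_STANDARD = re.compile(r"^(Re\(\d+\)|Sv|Antwort): ")


def followup_subject(precursor_subject):
    """Return the default Subject for a followup to the given precursor."""
    if precursor_subject.startswith(BACK_REFERENCE):
        return precursor_subject  # never stack a second "Re: "
    if NON_STANDARD.match(precursor_subject):
        return precursor_subject  # refrain from prepending anything further
    return BACK_REFERENCE + precursor_subject


def subject_for_comparison(subject):
    """Strip one initial "Re: " before comparing Subjects."""
    if subject.startswith(BACK_REFERENCE):
        return subject[len(BACK_REFERENCE):]
    return subject
```

Note that the sketch deliberately makes no attempt to rewrite a non-standard back-reference into "Re: ", in keeping with the caution that such repairs are likely to do more harm than good. The poster remains free to edit the resulting default.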
As well as the addition of "Re: ", the Subject-header MAY be refolded (which MAY include collapsing/expanding whitespace to/from a single SP at any point where the folding is changed). However, it MUST NOT (except by deliberate act of the poster) be truncated, extended or changed in any other way that might cause a reading agent to deduce that the subject of a thread had changed.

[Bruce wants users to be requested to confirm that they are happy with the default Subject as provided.]

3.2.1.1.1. Examples

In the following examples, please note that only "Re: " has any official status (and hence may be utilized by reading agents). "was: " is a convention used by many English-speaking posters to signal a change in subject matter. Software can always recognize that such changes have occurred from the References-header.

   Subject: Film at 11
   Subject: Re: Film at 11
   Subject: Godwin's law considered harmful (was: Film at 11)
   Subject: Godwin's law (was: Film at 11)
   Subject: Re: Godwin's law (was: Film at 11)
   Subject: Re: Godwin's law

3.2.1.2. Newsgroups

The Newsgroups of the followup are, by default, taken from those of the precursor, or from the precursor's Followup-To-header ([USEFOR] 8.6). But if the precursor's Followup-To-header is set to "poster", the user MUST be warned if he attempts to force the followup to be posted.

Followup agents SHOULD accept articles crossposted to newsgroups which do not exist on their local hosts (as opposed to posting agents, for which that requirement is only "MAY").

3.2.1.3. Mail-Copies-To

If the user attempts to email the poster as well as to followup, in the case where the Mail-Copies-To-header is absent, and even more so when it is present and there is an explicit "nobody", contrary to the RECOMMENDATION in [USEFOR] section 8.6, then the followup agent SHOULD issue a warning and ask for confirmation.
NOTE: This header is only relevant when posting followups to Netnews articles, and is to be ignored when sending pure email replies to the poster, which are handled as prescribed under the Reply-To-header.

[USEFOR] RECOMMENDS that where a followup is also emailed to the poster, a suitable Posted-And-Mailed-header be added.

NOTE: In addition to the Posted-And-Mailed-header, some followup agents also include within the body a mention that the article is both posted and mailed, for the benefit of reading agents that do not normally show that header.

3.2.1.4. References

Followup agents SHOULD trim message identifiers out of a References-header but SHOULD NOT do so until the number of message identifiers exceeds 21, at which time trimming SHOULD be done by removing sufficient identifiers starting with the second from the left so as to bring the total down to 21 (but the first message identifier MUST NOT be trimmed). However, it would be wrong to assume that References-headers containing more than 21 message identifiers will not occur.

3.2.2. Construction of Bodies

Followup agents SHOULD follow policies already described for posting agents (3.1.2) regarding the length of lines when generating new text. Exceptionally, they SHOULD NOT adjust the length of quoted lines (3.2.2.1) in followups unless they are able to reformat them in a consistent manner.

3.2.2.1. Quoting and Attributions

It is customary for the body of a followup to commence with an "attribution" referring to the "precursor" and to "quote" any text copied verbatim from the precursor with a suitable prefix. Followup agents MUST facilitate the automatic incorporation of these things, even though they are not mandated by any standard, in a manner consistent with the conventions described below.

These conventions for quotations and attributions describe widely used practices.
Since much software will attempt to recognize and act upon them, questions of interoperability can arise, and so the words "MUST", "SHOULD", etc. are here to be understood as more than advisory.

When the precursor had used the "format=flowed" parameter of text/plain [RFC 3676], and when the followup agent also supports "format=flowed", flowed paragraphs in the precursor (including any flowed lines within quotations in the precursor) SHOULD be reflowed. Thus, if all agents supported "format=flowed", no physical line, quoted or not, would ever exceed the default (or policy) limit, except by the deliberate intent of the poster.

Where the precursor was not flowed, its lines SHOULD initially be left alone when quoting, except that already quoted lines which appeared (from the presence of trailing SP) to have been flowed by one of the precursor's precursors MAY be treated as such. For use when "format=flowed" is not available, or when it fails to resolve the problem, the poster SHOULD be provided with a facility to rewrap lines of quoted text (but only lines all at the same quoting level).

When a followup agent incorporates the "precursor" as a quotation, it MUST be distinguished from the surrounding text in some way, and SHOULD be so distinguished by prefacing each line of the quoted text (even if it is empty) with the character ">" (or perhaps with "> " in the case of a previously unquoted line). This will result in multiple levels of ">" when quoted content itself contains quoted content, and it will also facilitate the automatic analysis of articles. A facility SHOULD be provided for the poster to select less than the complete body of the precursor in the quotation.

NOTE: Whilst posters should edit quoted context to trim it down to the minimum necessary, followup agents SHOULD NOT attempt to enforce this beyond issuing a warning (past attempts to do so have been found to be notably counter-productive).
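The quoting convention described above might be sketched as follows. The function name quote_body is hypothetical, and two choices are assumptions of this sketch rather than requirements of the text: previously unquoted lines receive "> " (one of the two forms the text allows), and empty lines receive a bare ">" so that no trailing SP is introduced that could later be mistaken for a flowed line.

```python
def quote_body(body):
    """Prefix every line of the selected precursor text with a quote mark.

    Lines that are already quoted (or empty) get ">" alone, so nested
    quotations accumulate as ">>", ">>>", and so on; previously unquoted
    lines get "> ".
    """
    quoted = []
    for line in body.split("\n"):
        if line == "" or line.startswith(">"):
            quoted.append(">" + line)
        else:
            quoted.append("> " + line)
    return "\n".join(quoted)


example = quote_body("Hello\n\n> earlier text")
print(example)
```

The sketch operates on whatever portion of the precursor the poster has selected, and makes no attempt to trim or rewrap it.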
The followup agent SHOULD also precede the quoted content by an "attribution line" (however, readers are warned not to assume that such lines are accurate, especially within multiply nested quotations). The following convention for such lines is intended to facilitate their automatic recognition and processing by sophisticated reading agents.

The attribution SHOULD contain the name and/or the email address of the precursor's poster, as in

   Joe D. Bloggs wrote:

or

   Helmut Schmidt schrieb:

The attribution MAY also contain a single newsgroup-name (the one from which the followup is being made), the precursor's message identifier and/or the precursor's Date and Time. Any of these that are present SHOULD precede the name and/or email address. However, the inclusion or not of such fields SHOULD always be under the control of the poster.

To enable this line, and the message identifier and the email address within it, to be recognized (for example to enable suitable reading agents to retrieve the precursor or email its poster by clicking on them), the following conventions SHOULD be observed:

   o The precursor's message identifier SHOULD be enclosed within <...> or

   o The precursor's poster's email address SHOULD be enclosed within <...>

   o The various fields may be separated by arbitrary text and they may be folded in the same way as headers, but attributions SHOULD always be terminated by a ":" followed by CRLF.

Further examples:

   On comp.foo in <1234@bar.example> on 24 Dec 2001 16:40:20 +0000, "Joe D. Bloggs" wrote:

   Am 24. Dez 2002 schrieb Helmut Schmidt :

3.2.2.2. Signatures

Followup agents, when incorporating quoted text from a precursor, SHOULD NOT include the signature in the quotation.

3.2.2.3. Usage of MIME

Followup agents which quote parts of a precursor SHOULD initially include all parts of the precursor that were displayed inline, as if they were a single part.

3.3. The Well-Behaved Reading Agent

3.3.1.
Display of Headers

The set of headers displayed to the reader of each article is usually a configurable option, but it MUST by default include at least the following:

   o From-header
   o Subject-header
   o Newsgroups-header
   o Followup-To-header
   o Reply-To-header

and SHOULD include the following:

   o Distribution-header
   o Posted-And-Mailed-header
   o Summary-header
   o Control-header

Moreover it MUST include a facility to "Display ALL headers" on demand.

NOTE: There is no necessity to display anything at all for a header that is completely absent, and indeed some of the ones listed may seldom, if ever, be seen; they are included simply because it is essential for the reader to be aware of them on the rare occasions when they do occur.

There is no requirement to display the headers in the exact format defined in [USEFOR] (for example, header-names may be displayed in some local language), but all the information in each header MUST be shown in some form. It may be necessary to provide a scrollbar or some equivalent means in order to display long headers (particularly long Subject-headers); alternatively, long headers can be folded ([USEFOR] 4.2.3) for display (although any folding provided by the original poster SHOULD be retained if it will fit in the display area). In any case, the display area SHOULD be able to accommodate up to 79 characters (or other established policy limit - see 7.3). Even if some header appears to be non-compliant with [USEFOR], it is better to display it exactly as received rather than not to display it at all.

[A better reference for these length limits is needed.]

Reading agents need to be prepared for ancient usages (and even non-compliance) which nevertheless still appear from time to time. In particular, the following is often seen:

   fred@foo.bar.example (Fred Bloggs)

as opposed to

   "Fred Bloggs" .

The former is a deprecated, but commonly encountered, usage and reading agents SHOULD take special note of such comments as indicating (e.g.
in killfiles) the name of the person whose mailbox it is.

3.3.2. Presentation of Articles

3.3.2.1. Threading

Reading agents SHOULD present the articles in each newsgroup in an order which ensures that the reader never sees a followup or reply to an article unless he has already had an opportunity to read the original. However, this may be easier said than done. Here are some methods commonly used to fulfil this aim; none of them works perfectly.

1. Present the articles in the order they were received at the local serving agent. However, articles propagated via different routes with different delays may well arrive out of order, so this may not be reliable.

2. Sort the articles into order according to their Date-headers. This will usually be better than the first method, but relies on the clock and timezone settings in posting agents being approximately correct. And although it satisfies the minimal recommendation at the head of this section, it will likely result in totally separate threads of discussion being merged in an unhelpful order.

3. Sort the articles according to their Subject-headers (or group them according to their Subject-headers, with the groups being presented in order of the Date-header). Within a group with the same Subject, sort according to the Date-header. This works tolerably well, but within a long discussion with many divergent subthreads, those subthreads are still merged in an unhelpful order. Moreover, it will occasionally bring together totally unrelated articles that just happen to have the same Subject by chance.

4. Construct a tree in which each article is within a sub-tree headed by each article mentioned in its References-header, and present articles by a depth-first traversal of that tree, sorting the siblings within each branch according to their Date-headers. This method is usually superior to the ones mentioned earlier, but it can go wrong for a number of reasons.
   a) References-headers are sometimes absent, or incomplete (and are even permitted to be trimmed when they get too long), and earlier articles in the threads may have expired off the local server. Nevertheless, with careful implementation, these problems are mostly surmountable.

   b) A poster may join an existing discussion (and clearly intend to do so by using the same Subject-header, possibly with a prepended "Re: ") and yet his article might not be created as a followup to any specific precursor and hence would not have a References-header. Hence it would be presented quite apart from the other (sub-)threads of that discussion.

   c) Conversely, the topic of some sub-thread might have diverged so far from the original topic of discussion that some poster decides to create a totally new Subject for his followup. Nevertheless, that followup, and the whole sub-thread which issues from it, will still be presented in the midst of the other sub-threads of the original discussion.

5. To counter these various deficiencies, various hybrid schemes have been devised which take account of all three headers, References-, Subject- and Date-, and these often succeed in providing a more pleasing presentation to the reader. However, different readers can be pleased in different ways, and so it is often the case that reading agents provide configurable options to choose between several methods.

This document does not single out any particular method as "the best". They are all to be considered acceptable, and implementors are encouraged to experiment accordingly. Nevertheless, it is inevitable that some combination of Subjects and followups will eventually arise that defeats even the most sophisticated scheme.
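By way of illustration only, method 4 above (a tree built from References-headers, traversed depth-first with siblings sorted by Date) might be sketched as follows. The Article class and the function thread_order are hypothetical names introduced for this example, and the sketch assumes that headers have already been parsed into message identifiers and dates. An article is attached to the nearest ancestor that is still available locally, which is one possible way of coping with problem (a); problems (b) and (c) are not addressed.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Article:
    """Hypothetical, already-parsed view of an article's headers."""
    message_id: str
    date: datetime
    references: list[str] = field(default_factory=list)


def thread_order(articles: list[Article]) -> list[tuple[int, str]]:
    """Return (depth, message_id) pairs in depth-first thread order."""
    by_id = {a.message_id: a for a in articles}
    children: dict[str, list[Article]] = {a.message_id: [] for a in articles}
    roots = []
    for art in articles:
        # Nearest ancestor still present locally; earlier ones may have
        # expired or been trimmed from the References-header.
        parent = next(
            (r for r in reversed(art.references)
             if r in by_id and r != art.message_id),
            None,
        )
        if parent is None:
            roots.append(art)
        else:
            children[parent].append(art)

    order: list[tuple[int, str]] = []
    seen: set[str] = set()

    def walk(start: Article) -> None:
        # Iterative depth-first traversal, so very deep threads do not
        # exhaust the recursion limit.
        stack = [(start, 0)]
        while stack:
            art, depth = stack.pop()
            if art.message_id in seen:
                continue
            seen.add(art.message_id)
            order.append((depth, art.message_id))
            kids = sorted(children[art.message_id],
                          key=lambda c: c.date, reverse=True)
            # Pushed latest-first so that the earliest sibling is popped first.
            stack.extend((k, depth + 1) for k in kids)

    for root in sorted(roots, key=lambda a: a.date):
        walk(root)
    for art in sorted(articles, key=lambda a: a.date):
        walk(art)  # picks up articles caught in a malformed reference cycle
    return order
```

The depth value can be used to indent the article list. A hybrid scheme of the kind mentioned in method 5 could start from this ordering and then regroup root articles by Subject.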
It must be noted, however, in the case of those methods which rely on the comparison of Subject-headers, whether to detect equality or for sorting, that there are certain additional precautions that need to be taken, such as:

   a) [USEFOR] permits a back-reference "Re: " to be prepended (optionally) to a Subject when creating a followup. Therefore, that back-reference SHOULD be stripped away before performing any comparison of Subjects. On the other hand, "Re: " is the only back-reference permitted, and therefore it is not necessary for translations of "Re: " into other languages to be recognized (even though such translations are sometimes generated by non-compliant followup agents). Likewise, that "Re: " is case-sensitive, although non-compliant agents that generate "RE: " are common enough that it might be wiser to accept that form also.

   [The above wording is subject to change according to what is finally said in [USEFOR].]

   b) It is not unknown for non-compliant followup agents to truncate the Subject-header. Some reading agents therefore truncate the Subject before making any comparison. Sometimes this makes things better; sometimes it makes them worse.

   c) The use of encoded-words ([RFC 2047]) within Subject-headers can give rise to different ways of encoding the same Subject. Therefore, such encoding SHOULD be undone before any comparison of Subject-headers is made. It cannot even be assumed that the back-reference "Re: " is not within an encoded-word.

   [It is possible that this matter will ultimately be addressed in [USEFOR] rather than here.]

3.3.2.2. Killfiles

The reader SHOULD be provided with a list of articles available for reading, as set out more fully in section 3.3.2. Within the list of articles presented, the reader SHOULD be given the choice of seeing only articles that have not yet been read, or of seeing all articles available in the particular newsgroup.
Moreover, articles crossposted to many newsgroups SHOULD be considered to have been read once they have been seen in any of those groups.

There SHOULD be a facility (usually known as a "killfile") for filtering out classes of article that the reader does not wish to see. As a minimum, it SHOULD be possible to filter on the Subject-header (preferably using regular expressions to describe the filter) and on the From-header, but ideally it should be possible on any header (so, for example, it would be possible to filter out excessive crossposting, or crossposts to particular groups). Moreover, it SHOULD be possible to filter out all followups to some given article, by filtering on the References-header or by building upon whatever threading facility has been provided. The filters included in a killfile may be permanent, or for a limited period.

A corresponding set of filters to preselect articles for reading MAY also be provided.

3.3.3. Interpretation of Bodies

Implementors of reading agents need to be aware of ancient usages (and even non-compliance) which nevertheless still appear from time to time, and SHOULD endeavour to recognize them and display them appropriately. An example of this is the use of Backspace by posting agents in order to construct composite characters (e.g. by underlining) (3.1.2).

Tab (US-ASCII 9) SHOULD be interpreted as sufficient horizontal white space to reach the next of a set of fixed positions (customarily set at every 8th character).

Formfeed (US-ASCII 12) (which is sometimes referred to as the "spoiler character") signifies a point at which the reading agent SHOULD pause and await reader interaction before displaying further text.

Reading agents MUST provide facilities to display the whole of long lines up to the maximum of 998 characters (whether by wrapping or by providing horizontal scroll bars). However, cutting and pasting of wrapped lines SHOULD copy the original unwrapped line (i.e.
all CRLFs not in the original should be discarded).

3.3.3.1. Usage of MIME

Even though this document, or applicable policy, may discourage the use of some Content-Types, all reading agents SHOULD make some realistic attempt to display at least all text types (especially where the Content-Disposition is "inline"), even if all that can be done is to exhibit any formatting information as received (thus allowing a suitably knowledgeable reader to interpret it manually). The same applies to unrecognized charsets.

It is not expected that reading agents will necessarily be able to present characters in all possible character sets (for example, a reading agent might be able to present only the ISO-8859-1 (Latin 1) characters [ISO 8859]), but where unpresentable characters arise they SHOULD be presented in some escaped notation, e.g. octal or hexadecimal (rather than as some single distinctive glyph or by exhibiting a warning).

Reading agents MAY interpret image, audio and video Content-Types inline, but few in fact do so (and the use of such Content-Types is anyway deprecated in the absence of established policy to the contrary - see 3.1.2.2).

Likewise, reading agents MAY interpret "application" types (and SHOULD at least display those types which are inherently textual in nature). However, there are security risks inherent in some application types, and even in "text/html" ([USEFOR] 9.2.2). Even requiring the reader to click on some icon before proceeding with the application has proven notoriously ineffective against malicious attacks. The only safe alternative is to execute the application within a protected environment, or "sandbox", outside of which its side effects cannot occur.

Of the multipart Content-Types, reading agents MUST handle correctly at least "multipart/mixed" and "multipart/alternative". Other multipart types that are not implemented directly MUST be treated as "multipart/mixed".
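The recommendation above that unpresentable characters be shown in an escaped notation might, as one possibility, be realized along the following lines. The function name escape_unpresentable and the particular notation \x{HHHH} are assumptions made for this sketch; the text above asks only for "some escaped notation", such as octal or hexadecimal. The sketch also assumes that the body has already been decoded into a string and that the display can present exactly the repertoire of the named charset.

```python
def escape_unpresentable(text: str, charset: str = "iso-8859-1") -> str:
    """Replace characters outside `charset` with a hexadecimal escape.

    Hypothetical helper: `charset` stands for the repertoire that the
    reading agent's display is assumed to be able to present.
    """
    out = []
    for ch in text:
        try:
            ch.encode(charset)
        except UnicodeEncodeError:
            # Not presentable: show the code point in hexadecimal rather
            # than a single placeholder glyph.
            out.append(f"\\x{{{ord(ch):04X}}}")
        else:
            out.append(ch)
    return "".join(out)
```

With the default repertoire, a EURO SIGN (U+20AC) would be rendered as the six-or-more-character sequence \x{20AC}, while characters within Latin 1 pass through unchanged.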
It is a regular practice for some Usenet articles to consist of digests of other messages or informative documents (usually known as "FAQ"s). These take the form of digests, as defined in [RFC 1153] or of the MIME Content-Type "multipart/digest". Reading agents SHOULD recognize both of these formats and enable the individual digest items to be presented separately, as if they were separate articles.

Reading agents SHOULD honour any Content-Disposition-header that is provided (in particular, they SHOULD display any part of a multipart for which the disposition is "inline", possibly distinguished from adjacent parts by some suitable separator). In the absence of such a header, the body of an article or any part of a multipart with Content-Type "text" SHOULD be displayed inline.

3.4. The Well-Behaved Reply Agent

First and foremost, a reply agent is an Email agent, and therefore its primary responsibility is to generate messages that are compliant with [RFC 2822] and other applicable Email standards and conventions.

When a reply is to be emailed to the poster of an article, the reply agent MUST initially create a To-header from the Reply-To- or From-header, as appropriate, of the precursor.

NOTE: A distinction is to be made between when a reply is emailed to the poster of an article, and when such a reply is also posted during the course of generating a followup; in the latter case (but not the former) it is expected that any Mail-Copies-To header will have been observed. Note also that use of the Posted-And-Mailed header is appropriate whenever a message is both posted and emailed, whether or not this is done during the course of a formal followup.

Since addresses ending in ".invalid" are undeliverable, reply agents SHOULD warn any user attempting to reply to them and SHOULD NOT, in any case, attempt to deliver to them (since that would be pointless anyway).

3.5.
User Interfaces

The basic functionalities provided to the poster MUST include the following:

   o To Post a new article;

   o To Followup to an existing article;

   o To Reply by email to the poster of an existing article (assuming, of course, that an email capability is available);

and SHOULD include

   o To Cancel or Supersede articles previously posted by that same user (though this mechanism MUST NOT be available for other people's articles).

The commands provided to the user to instigate these operations SHOULD, in the case of agents designed for English speakers at least, make use of the words "Post", "Followup", "Reply" and "Cancel" (or of their initial letters). It is NOT sufficient, particularly in the case of user agents intended for dual Email and Netnews use, simply to re-use the words for the corresponding Email operations (such as "Send" instead of "Post" or "Reply" instead of "Followup" or "Delete" instead of "Cancel"). It SHOULD be immediately evident to an ordinary, untrained user which command to use for each of the operations.

There MAY also be a separate facility encompassing "Followup and Reply", but in that case the provisions of any Mail-Copies-To-header in the precursor (3.2.1.3) SHOULD be observed. It SHOULD be possible to switch between the Followup, Reply and "do both" commands, even after the article body has been edited. If, for whatever reason, there is only one command encompassing all of these operations, its default action MUST be to Reply (with no possibility to configure it otherwise).

The user is responsible for providing at least the Newsgroups- and Subject-headers of the new article; in the case of Followups and Replies, it is usual for the user agent to provide defaults for these, but in all cases facilities for the user to edit these MUST be provided.
In particular, it MUST be possible to specify multiple Newsgroups (the effect of which MUST be for them to be cross-posted rather than multi-posted), but the poster SHOULD be prevented from (or at least warned against) excessive crossposting and SHOULD be offered the opportunity to set a Followup-To-header if he insists on an excessive cross-post. Excessive numbers of newsgroups in a Followup-To-header SHOULD be discouraged likewise.

The user MUST be warned (and SHOULD be prevented) if he attempts to post an article whose body is empty, or which contains only quoted text.

When the article is finally posted, the user MUST be warned (with severe wording) (and SHOULD be prevented) if he attempts to post the same article again, unless the system is able to report explicitly that the original posting had failed.

See section 3.1 for further requirements and recommendations to be followed when posting articles.

[ISO 8859] International Standard - Information Processing - 8-bit Single-Byte Coded Graphic Character Sets.
   Part 1: Latin alphabet No. 1, ISO 8859-1, 1987.
   Part 2: Latin alphabet No. 2, ISO 8859-2, 1987.
   Part 3: Latin alphabet No. 3, ISO 8859-3, 1988.
   Part 4: Latin alphabet No. 4, ISO 8859-4, 1988.
   Part 5: Latin/Cyrillic alphabet, ISO 8859-5, 1988.
   Part 6: Latin/Arabic alphabet, ISO 8859-6, 1987.
   Part 7: Latin/Greek alphabet, ISO 8859-7, 1987.
   Part 8: Latin/Hebrew alphabet, ISO 8859-8, 1988.

[ISO/IEC 10646] "International Standard - Information technology - Universal Multiple-Octet Coded Character Set (UCS) - Part 1: Architecture and Basic Multilingual Plane", ISO/IEC 10646-1:2000, 2000.

[RFC 1153] F. Wancho, "Digest Message Format", RFC 1153, April 1990.

[RFC 1847] J. Galvin, S. Murphy, S. Crocker, and N. Freed, "Security Multiparts for MIME: Multipart/Signed and Multipart/Encrypted", RFC 1847, October 1995.

[RFC 2046] N. Freed and N. Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types", RFC 2046, November 1996.

[RFC 2047] K.
Moore, "MIME (Multipurpose Internet Mail Extensions) Part Three: Message Header Extensions for Non-ASCII Text", RFC 2047, November 1996.

[RFC 2396] T. Berners-Lee, R. Fielding, U.C. Irvine, and L. Masinter, "Uniform Resource Identifiers (URI): Generic Syntax", RFC 2396, August 1998.

[RFC 2440] J. Callas, L. Donnerhacke, H. Finney, and R. Thayer, "OpenPGP Message Format", RFC 2440, November 1998.

[RFC 2606] D. Eastlake and A. Panitz, "Reserved Top Level DNS Names", RFC 2606, June 1999.

[RFC 2821] John C. Klensin and Dawn P. Mann, "Simple Mail Transfer Protocol", RFC 2821, April 2001.

[RFC 2822] P. Resnick, "Internet Message Format", RFC 2822, April 2001.

[RFC 3156] M. Elkins, D. Del Torto, R. Levien, and T. Roessler, "MIME Security with OpenPGP", RFC 3156, August 2001.

[RFC 3676] R. Gellens, "The Text/Plain Format and DelSp Parameters", RFC 3676, February 2004.

[UNICODE 3.2] The Unicode Consortium, "The Unicode Standard - Version 3.2, being an amendment to [UNICODE 3.1]", Unicode Standard Annex #28, 2002.

[USEFOR] Charles H. Lindsey, "News Article Format", draft-ietf-usefor-article-format-*.txt.