Tim Berners-Lee
Date: 1998, last change: $Id: Notation3.html,v 1.49 2001/11/27 23:59:33 timbl Exp $
Status: Tries to keep ahead of changes to the code. There are now machine-readable grammars for N3.

An RDF language for the Semantic Web

Notation 3

Notation 3 (N3) is a compact and readable alternative to RDF's XML syntax, extended to allow greater expressiveness. It has subsets, one of which is equivalent to RDF 1.0, and one of which is RDF plus a form of RDF rules.

Aims:

To optimize expression of data and logic in the same language
To allow RDF to be expressed
To allow rules to be integrated smoothly with RDF
To allow quoting so that statements about statements can be made.

The language achieves these with the following features:

URI abbreviation using prefixes which are bound to a namespace (using @prefix) a bit like in XML,
Repetition of another object for the same subject and predicate using a comma ","
Repetition of another predicate for the same subject using a semicolon ";"
Blank nodes (bnodes): for a node with certain properties, just put the properties between [ and ]
Formulae allow N3 graphs to be quoted within N3 graphs using { and }
Variables and quantification allow rules, etc to be expressed
The grammar is simple and consistent.

Resources on Notation3 include the following:

A primer for getting into RDF using N3
A tutorial on treating Semantic Web data, taught using N3
Examples
SWAP: cwm and other tools; Open Source.
Design alternatives considered in the design of Notation3
Syntax definition of N3 (in RDF/XML; in RDF/N3)
n3-mode for emacs
Old grammar, hand-written.

N3 Subsets

For various purposes it is useful to define limited subsets of the language.

NTriples is an extremely constrained language which uses hardly any syntactic sugar at all: it is optimized for reading by scripts and for comparison using text tools. It allows just one triple on each line. It was designed for the reference output of parsers in the RDF test suite. (It is not committed to being an exact subset of N3: as currently specified, it uses a \uXXXX form for encoding Unicode characters in URIs, at variance with N3 and the IRI drafts, which use %XX encoding of UTF-8. Currently this output option is the --n3=u flag in cwm.)

Turtle is another subset, for only expressing RDF. It is like n3-rdf below except that it does not have the path syntax.

A grammar for N3 is available in RDF, and I have made a few subsets, listed below, which may be useful. I haven't tested the subsets much; mail me if you are interested in using them in earnest.

Context-free grammar descriptions of N3
N3 the full language	N3	RDF/XML	HTML	yacc
N3-rdf (under development). This is an N3 language which is constrained so that only correct RDF graphs can be defined. This is all you need for data.	N3	RDF/XML	HTML
N3-rules (under development). This subset allows {} only for making rules like {...} =>{...}, to be equivalent to various other rule languages out there.	N3	RDF/XML	HTML
N3-QL (under development) Restrictions very similar to N3-Rules	N3

Comparison of N3 subsets
Feature	Expresses RDF 1.0	@prefix [], ; a	Collections	Numeric literals (int)	Numeric literals (float)	Literal subj	RDF Path	Rules	Formulae
syntax:			(<a> <b>)	2		7 a n:prime.	x!y^z	{?x}=>{?x}	{} @forAll
NTriples	y
Turtle	y	y	y	y 2004/2
N3 RDF	y	y	y	y	y	y	y
N3 Rules	y plus	y	y	y	y	y	y	y
N3	y plus	y	y	y	y	y	y	y	y

Try it!

See also SWAG: Notation3 to RDF Converter, which (as of Aug 2001) is more actively maintained than this form.

Encoding

N3 files are encoded in UTF-8 (see RFC2279), normalized in Normalization Form C. The language is defined in terms of a sequence of Unicode characters. (Implementations may choose to work with 8-bit bytes, passing bytes >7F transparently, but this will not allow them to check the validity of embedded non-ASCII characters. A full Unicode implementation would in some ways be preferable. The exact set of Unicode characters to be allowed in (a) literal strings or (b) identifiers is not yet clear.)

Mime type

This document defines (more or less) the allowable syntax and, for documents of valid syntax, the meaning of a document in a proposed text/rdf+n3 mime type. This mime type is not used with a charset parameter: the encoding is always UTF-8.

(The type application/n3 was applied for at one point (2002?) but I have no trace of any reply from IANA.)

BNF Grammars and parsers

The original hand-written grammar is a bit crufty; several more formal grammars have been developed. The n3.n3 grammar is definitive.

Some parsers for Notation3 and its subsets (2004)
Parser	level	written in	Author	Notes
notation3.py	N3	Python	Connolly & Berners-Lee	Handwritten, used in cwm, W3C open source
RDF/N3 parser.	?	Python	Graham Kline
n3.py	No () or {}	Python	Sean Palmer	2002. Still extant? Part of Eep
afon	N3	Python	Sean Palmer	Uses regexps. Roughly same speed as notation3.py
RDF::Notation3	?	Perl	Ginger Alliance
notation3 parser	N3	Java	Jos de Roo	Part of Agfa's Euler
N3 parser (?)	?	Java	Andy Seaborne, HP Labs	Part of HP's Jena
parser	~Turtle, sans lists	PHP	Gunnar Grimnes	Based on SBP's n3.py, GPL
RAP	N3-rdf?	PHP	chris@bizer.de (Chris Bizer)	RDF API for PHP is a software package for searching, manipulating, and serving RDF models, integrated RDF/XML, N3 and N-Triple parser and serializer.
Raptor	Turtle	C	Dave Beckett	Redland compatible
SWI-Prolog	Turtle	Prolog	Jan Wielemaker	parser
n3.bnf	?	blindfold	Sandro Hawke	Blindfold is a bnf-driven parser.
flaten3	?	lex, yacc	Sandro Hawke	"A first pass at an n3 parser using lex and yacc"
n3spark.py	?	spark	Sandro Hawke	A re-implementation of RDF/n3 syntax using the SPARK tools. Aug 2001; discussion
rfdn3-gram	?	yapps	Dan Connolly	a Yapps grammar for RDF Notation 3. Aug 2001; discussion
Eulermoz	?	Javascript	Euler team	Euler-inspired inference in Mozilla
n3p.py	?	Python	Sean Palmer	See email

This list may be inaccurate and is probably out of date. Mail me with differences you know about, and if you are using this stuff, check the web sites and Google for new implementations.

Test Suites

Here are some manifests of test files. A positive parser test is one which an N3 parser should parse. It has a set of NTriples which should be produced. A negative parser test is a file which should produce an error when an N3 parser tries it.

Test files
Tests	Level	Author	Notes
SWAP n3parsertests.n3	N3	Scharf et al	Positive and negative parser tests.
Turtle tests	Turtle	David Beckett	Positive and negative parser tests
NTriples examples	NTriples	?	Do tests exist? Maybe examples from the RDF specs.

Parser implementers are encouraged to generate test results in RDF in the same form, so that results from multiple implementations can be tabulated.

Syntax details

Tokenizing is not explicitly specified, in that white space is left out of the BNF for simplicity. White space must be inserted whenever the following token could start with a character which could extend the preceding token. White space may not be inserted within a token. An exception may be that we allow and remove white space within a URI in angle brackets.

All URIs are quoted with angle brackets. White space within the <> is to be ignored. White space may therefore be used on output to split a long URI between lines.

Qualified names have colons, so unquoted alphanumerics are all keywords, unless the @keywords directive is given, in which case the keywords given are keywords and anything else is a localname in the default namespace. Any keyword may be given even if not in the keyword list by prefacing it with "@". Because keywords are declared in this way, we will have the freedom later to make extensions to the syntax using new keywords without fear of ambiguity. However, the tokenizer has to be aware of the @keywords setting.

The @prefix directive binds a prefix to a namespace URI. It indicates that a qname with that prefix is shorthand for the URI formed by concatenating the namespace identifier and the part of the qname to the right of the (only allowed) colon.

The empty prefix "" may be, and is by default (2004/3 on), bound to the local namespace <#>. This means that <#foo> can be written :foo, saving two bytes(!). With the @keywords directive one can reduce that to foo.
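The concatenation rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how cwm does it: the function name expand_qname and the prefix table are made up for the example, the dc namespace URI is just a sample binding, and the empty prefix is assumed to be bound so that :foo stands for <#foo>.

```python
def expand_qname(qname, prefixes):
    """Expand prefix:local by plain concatenation (hypothetical helper)."""
    prefix, colon, local = qname.partition(":")
    if not colon:
        raise ValueError("not a qname (no colon): %r" % qname)
    if ":" in local:
        raise ValueError("only one colon is allowed: %r" % qname)
    if prefix not in prefixes:
        raise KeyError("unbound prefix: %r" % prefix)
    # Namespace identifier + the part to the right of the colon.
    return prefixes[prefix] + local


# Assumed bindings, as if the file had started with
#   @prefix dc: <http://purl.org/dc/elements/1.1/> .
# and the empty prefix had its default, local binding.
prefixes = {"": "#", "dc": "http://purl.org/dc/elements/1.1/"}

print("<%s>" % expand_qname("dc:title", prefixes))
print("<%s>" % expand_qname(":foo", prefixes))
```

A real parser would also resolve the resulting relative reference against the base URI of the document; that step is left out here.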

The """value""" string form is used simply for multi-line values or values containing quote marks.

(To do: Define literal data to arbitrary terminator ( """"""zyzzy"This is a string"zyzzy"""""") or something?)

The Unicode sets are not well defined: be conservative in what you use and liberal in what you accept. This may be the subject of further study. (This should be somebody else's problem, in that the set of name characters is common to many languages. In this case, the treatment of Unicode characters outside the ASCII set is compatible with the XML 1.1 standard.)

Semantics

In property lists, the semicolon ";" is shorthand for repeating the subject; in object lists, the comma "," is shorthand for repeating both the subject and the verb.
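What the two shorthands stand for can be spelled out with a small Python sketch. The nested-list representation and the name expand_property_list are assumptions made for this example (a real parser works on tokens), and x:knows, :dan and :sean are made-up terms.

```python
def expand_property_list(subject, property_list):
    """Spell out every triple that a ';' and ',' shorthand stands for."""
    triples = []
    for verb, objects in property_list:   # ';' separates these entries
        for obj in objects:               # ',' separates these objects
            triples.append((subject, verb, obj))
    return triples


# Stands for:  :ora x:firstname "Ora" ; x:knows :dan, :sean .
shorthand = [("x:firstname", ['"Ora"']), ("x:knows", [":dan", ":sean"])]

for triple in expand_property_list(":ora", shorthand):
    print(" ".join(triple) + " .")
```

Each printed line is one full statement, with the subject (and, for the comma, the verb) repeated.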

Anonymous nodes

[ pl ] means x, where there exists some x such that pl holds. So,

[ :firstname "Ora" ] dc:wrote [ dc:title "Moby Dick" ] .

is a statement (false, I suspect) which in mathematical notation would be written

exists x, y . firstname(x, "Ora") & dc:wrote(x,y) & dc:title (y, "Moby Dick")

or in English: "Some person who has a first name Ora wrote a book entitled 'Moby Dick'." Note: not "the book" or "the person".

This can equally well be written

[x:firstname "Ora" ; dc:wrote [dc:title "Moby Dick" ]].

[] x:firstname "Ora" ; dc:wrote [dc:title "Moby Dick" ].

Paths

These are just shorthand. x!p means [ is p of x ] in the above anonymous node notation. You can read it as "x's p". This is a little reminiscent of the "." in the object-oriented programming "object.slot" syntax. (Note that "." could historically be used instead of "!", but if it is, it must be immediately followed by the next path element with no white space. This is to distinguish it from the trailing "." of a statement. The tokenizer needs to look ahead one character to resolve these. This use of "." is deprecated.)

The reverse traversal, x^p means [ p x ] . For either forward or backward traversal, p is a property, and x can be a whole path with both ! and ^ in it. Any path with at least one traversal is anonymous.

Example:

:joe!fam:mother!loc:office!loc:zip	Joe's mother's office's zipcode

:joe.fam:mother^fam:mother	Anyone whose mother is Joe's mother.

Note: The path traversal operator was just "!" and then was "." and currently is either. This is an open design question and parsers should accept either, but generators should only use "!".
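Since paths are only shorthand, the rewriting into square-bracket notation can be sketched mechanically. The Python below is an illustration, not the cwm code: expand_path is a made-up name, only the "!" and "^" operators are handled (not the deprecated "."), and it assumes that neither character occurs inside a term such as a <URI>.

```python
import re


def expand_path(path):
    """Rewrite x!p and x^p into the equivalent [ ... ] notation."""
    # Splitting on a captured group keeps the operators in the result:
    # ":joe!fam:mother" -> [":joe", "!", "fam:mother"]
    parts = re.split(r"([!^])", path)
    node = parts[0]
    for op, prop in zip(parts[1::2], parts[2::2]):
        if op == "!":
            node = "[ is %s of %s ]" % (prop, node)   # forward: x's p
        else:
            node = "[ %s %s ]" % (prop, node)         # reverse: whose p is x
    return node


print(expand_path(":joe!fam:mother!loc:office!loc:zip"))
print(expand_path(":joe!fam:mother^fam:mother"))
```

The traversals are applied left to right, so each step wraps the node built so far.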

Formulae

An RDF document parses to a set of statements. N3 allows such a set itself to be referred to within the language, and calls it a formula. A { statementlist } is a formula whose meaning is the logical conjunction (equivalent to syntactic juxtaposition) of the statements in the list. It is a set, as repeating the same statement adds no meaning. It is an unordered set. It is an RDF graph.

Apart from the set of statements, and extending the basic RDF graph, it also has a set of URIs of symbols which are universally quantified, and a set of URIs of symbols which are existentially quantified.

Example:

{ [ x:firstname  "Ora" ] dc:wrote [ dc:title  "Moby Dick" ] } a n3:falsehood .

This claims that the expression in {} is false - that there is nothing called Ora which wrote anything titled "Moby Dick".

A formula is considered, like a literal string, to be defined only by its contents.

As well as a set of statements, a formula comprises two sets, one of URIs which will be used as universal variables, and one of URIs which will be used as existential variables.

The semantics of a formula are that the contents are quoted. Variable substitution does recursively take place within a formula, but substitution of equals does not. The variable substitution is used, for example, when formulae are used for rules and patch file formats. See the tutorial introduction to rules.

Certain properties may, by their semantics, allow the propagation of substitution of equals by agents which are aware of those semantics. So, for example, it is useful to define a disjunction operator ex:or such that if the statement { F ex:or G } is true, where F and G are formulae, and a = b, then { F ex:or G' } is also true, where G' is the result of substituting b for a in G.

Terminology: Context and Formula

The terminology is that the set of statements is a formula; the particular formula in which a statement is found is its context. "Formula" is therefore the class, and "context" is the relationship between a statement and a formula.

Blank node identifiers in Formulae

N3 allows the _: namespace as in NTriples. These identifiers are used to identify blank nodes in the graph. They are generalized in N3 such that they identify blank nodes in the local formula. They are arbitrary temporary names for nodes which are existentially quantified within the current formula (not the whole file).

Quantification

The @forAll directive declares variables which are universally quantified: the formula is true for any value of the variable. Similarly, @forSome gives an existential quantification: there exists some value of the variable for which the context is true. (Note also that the square bracket notation introduces a blank node, which is an unnamed existential variable, and also that the _: namespace from the NTriples spec is a dummy namespace used to represent nodes which are existentially quantified and unnamed.)

{ @forSome <#g>. <#g> <#loves> <#you> } .

is equivalent to

[ <#loves> <#you> ] .

If both universal and existential quantification are specified for the same context, then the scope of the universal quantification is outside the scope of the existentials:

{ @forAll <#h>. @forSome <#g>. <#g> <#loves> <#h> }.

means

for all h
   there exists some g such that
     g loves h

("Everyone has someone who loves them" rather than "Somebody loves everybody")

which you might think of as

∀h(∃g(loves(g,h)))
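Why the order of the quantifiers matters can be checked over a small finite world. This Python sketch is only an illustration: the three people and the loves relation are invented for the example, and the nested all/any calls mirror the two possible nestings of the quantifiers.

```python
# A made-up world: each person loves exactly one other person.
people = {"alice", "bob", "carol"}
loves = {("alice", "bob"), ("bob", "carol"), ("carol", "alice")}

# Universal outside, existential inside:
# for all h, there exists some g such that g loves h.
everyone_is_loved = all(
    any((g, h) in loves for g in people) for h in people
)

# The other nesting: there exists some g such that, for all h, g loves h.
somebody_loves_everybody = any(
    all((g, h) in loves for h in people) for g in people
)

print("Everyone has someone who loves them:", everyone_is_loved)
print("Somebody loves everybody:", somebody_loves_everybody)
```

In this world the first reading holds and the second does not, which is the distinction the scoping rule is there to pin down.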

String escaping

Escaping in strings uses the same conventions as Python strings, except for a \U extension introduced by the NTriples spec. N3 strings represent ordered sequences of Unicode characters.

Some escapes (\a, \b, \f, \v) should be avoided because the corresponding characters are not allowed in RDF. This is not yet (2001/3) implemented in the cwm parser.

Escape Sequence	Meaning
`\newline`	Ignored
`\\`	Backslash (`\`)
`\'`	Single quote (`'`)
`\"`	Double quote (`"`)
`\n`	ASCII Linefeed (LF)
`\r`	ASCII Carriage Return (CR)
`\t`	ASCII Horizontal Tab (TAB)
`\ooo`	byte with octal value `ooo` (deprecated**)
`\xhh`	byte with hex value `hh` (deprecated**)
`\uhhhh`	character in BMP with Unicode value U+`hhhh`
`\U00hhhhhh`	character in plane 1-16 with Unicode value U+`hhhhhh`

Note that in N3, the double quote character is used for strings. The single quote character is reserved for future use. The single quote character does not need to be escaped in an N3 string.

**RDF and N3 are defined in terms of characters, not bytes. Therefore, the \ooo and \xhh escapes are deprecated.

The hexadecimal digits in Unicode escapes are UPPERCASE. This is designed to match NTriples strings.
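The table above can be turned into a small decoder. The Python below is a sketch under stated assumptions, not the cwm implementation: unescape is a made-up name, it works on the text between the quotes, it leaves out the deprecated \ooo and \xhh byte escapes, and it leaves any unrecognised sequence (including a \u escape with lowercase hex digits) untouched rather than reporting an error.

```python
import re

# Single-character escapes from the table; backslash-newline is ignored.
_SIMPLE = {"\\": "\\", "'": "'", '"': '"',
           "n": "\n", "r": "\r", "t": "\t", "\n": ""}

# A backslash followed by \UXXXXXXXX, \uXXXX (uppercase hex only)
# or one of the single-character escapes.
_ESCAPE = re.compile(r"\\(U[0-9A-F]{8}|u[0-9A-F]{4}|[\\'\"nrt\n])")


def unescape(body):
    """Decode the escapes in the body of an N3 string (hypothetical helper)."""
    def replace(match):
        esc = match.group(1)
        if esc[0] in "uU":
            return chr(int(esc[1:], 16))   # Unicode code point in hex
        return _SIMPLE[esc]
    return _ESCAPE.sub(replace, body)


print(unescape(r'caf\u00E9 \"au lait\"'))
print(unescape(r"tab\there"))
```

Because the result is a sequence of characters rather than bytes, there is no sensible place in this sketch for the byte-valued escapes, which is the reason given above for deprecating them.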

Identifier munging

This syntax does not allow minus signs in identifiers, whereas the XML encoding for RDF does.

The current solution is mapping sequences of "-" and "_" into sequences of "_" by taking a contiguous sequence of - and _, replacing _ with 0 and - with 1, then adding a leading "1", taking what you have as a binary number, subtracting 1 from the result, and then writing that many _ signs. The mapping is 1:1, and maps the simple case of - onto __ and _ onto _. The only disadvantage is that those who go crazy with n consecutive occurrences of - and/or _ in XML will pay for it in an even crazier 2**n long sequence in the N3. (2001/12/4)
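The mapping just described is easier to check in code than in prose. The Python below is a sketch of that recipe and of its inverse; the names munge and unmunge are made up for this example, and it is not taken from cwm.

```python
import re


def munge(name):
    """Map each run of '-' and '_' onto a run of '_' only."""
    def replace(match):
        # '_' -> 0, '-' -> 1, with a leading 1, read as a binary number.
        bits = "1" + match.group(0).replace("_", "0").replace("-", "1")
        return "_" * (int(bits, 2) - 1)
    return re.sub(r"[-_]+", replace, name)


def unmunge(name):
    """Invert munge(): recover the original run from the count of '_'."""
    def replace(match):
        # Add the 1 back, write in binary, drop the '0b' and the leading 1.
        bits = bin(len(match.group(0)) + 1)[3:]
        return bits.replace("0", "_").replace("1", "-")
    return re.sub(r"_+", replace, name)


for original in ["foo_bar", "foo-bar", "a--b", "x-_-y"]:
    munged = munge(original)
    print(original, "->", munged, "->", unmunge(munged))
```

A run of n characters becomes somewhere between 2**n - 1 and 2**(n+1) - 2 underscores, which is the exponential cost mentioned above.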

A messy thing from the N3 point of view is that for content negotiation to work between RDF/XML and N3, the fragment identifier syntax must be the same, and this would suggest that both use the "-" (XML) form.

It would be nice to be able to outlaw - in IDs for RDF.

Notes on Numbers

The N3 grammar describes the syntax for various literal productions. These syntax strings identify values which are members of various classes of number. The bnf productions should not be confused with the relationship between number classes. In the syntax, integer, rational and real productions are distinct. When it comes to the values they represent, all integers are also rational numbers, and all rationals are also real numbers.

My assumption about the number model is that there is no distinction between the rationals 1.0 and 1/1 and the integer 1. I haven't implemented this yet, but my intent is that the semantics of these numbers is true to normal mathematical equality.

The issue of reals is more complicated, as any real literal is necessarily approximate. So while there is a real number which is equal to the rational 1 or 1/3, reals do not support this comparison. (See XML Schema Datatypes)
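The intended number model can be illustrated with exact rational arithmetic. In the Python sketch below, the standard fractions module is only a stand-in for that model and a binary float stands in for a real literal; neither is what any N3 implementation is known to use.

```python
from fractions import Fraction

# The integer 1, the decimal 1.0 and the ratio 1/1 denote the same value.
one_as_integer = Fraction(1)
one_as_decimal = Fraction("1.0")
one_as_ratio = Fraction(1, 1)
print(one_as_integer == one_as_decimal == one_as_ratio)

# A floating-point "real" can only approximate one third.
third = Fraction(1, 3)
approx_third = Fraction(1.0 / 3.0)   # the exact value held by the float
print(approx_third == third)
print(abs(approx_third - third) < Fraction(1, 10**15))
```

The integer and rational forms compare equal exactly, while the approximated third is close to, but not equal to, the rational 1/3.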

Whilst complex numbers (with integer, rational or real parts) are a reasonable class to add, I don't see it as a priority at the moment.

Reification

Notation3, of any level, can be represented as an RDF graph using a vocabulary <http://www.w3.org/2004/06/rei#>. Code exists (cwm --reify and cwm --dereify) to convert N3 to such a description and back. This allows, for example, rule files to be manipulated as RDF data.

This reification also allows an RDF graph to be described, quoted, within another RDF graph. (Note that the "reification" described in the RDF 1.0 specification is different, does not quote properly, and is not recommended.)

Identifying

A Notation3 document has fragment identifiers in the form of alphanumeric strings with an initial alpha; underscores are allowed but not minus signs.

To label something you just invent that name and use it. There is no distinction between definition and reference. This is a fact of life, not of RDF or notation3. (A definitive reference is one in a document demonstrated to be definitive with respect to the namespace... but that is another story.)

<#ora> x:firstname "Ora".

<#ora> x:lastname "Lassila".

or indeed

:ora x:firstname "Ora"; x:lastname "Lassila".

Issues

A summary of recent design issues in Notation3:

Should - signs be allowed in local identifiers and fragment identifiers? (yes)
Should parsers use URI canonicalization to convert RDF symbols on input? Similarly, should we restrict all URIs to be canonicalized on output? (A processing option, on by default, off to be strict for RDF tests)
Should we allow and remove whitespace within a URI in angle brackets? This makes it easy to split long URIs in things like email or 80 column source code, and was rather why the <> were originally associated with URIs. (Yes, not implemented though, nor are there any tests.)
Should we allow "." as an alternative to "!" as a forward path traversal operator? This makes the tokenizer difficult because of the overloading of ".", but it looks cute for object-oriented users. (No. cwm currently does and the tests include it).
Define literal data to arbitrary terminator ( """"""zyzzy"This is a string"zyzzy"""""") or something? (No, not yet)
Define a syntax for an unordered set, as compared with the ordered collections which are described by the (list parens). Currently, the RDF description we use for a set, for example, of the integers 13, 51 and 6 is [ owl:oneOf ( 13 51 6 ) ]. Various syntaxes have bee

Much detail on other and old issues is listed separately.

Acknowledgements

Dan Connolly wrote the first N3 parser. Sean Palmer and other folks on irc://openprojects.net/rdfig suggested many things and reviewed new ideas (and scrapped old ones!). Thanks to all implementers.

Change History

2003/03

Added literal numbers

2003/02

Added @forall @forsome

2001/04/10

Removed \a, \b, \f, \v because they are not allowed in XML. Changed "ASCII character" to "byte" for \ooo and \xhh. (duerst)

2000/12/29

Switched from bind to @prefix. Code still supports bind.

Added ( node node ...) list shorthand, as code now reads and writes it.

Added a little about self-describing documents.

$Log: Notation3.html,v $

Revision 1.123 2005/01/12 17:07:57 timbl

(timbl) Changed through Jigsaw.

Revision 1.122 2004/12/16 02:53:49 timbl

(timbl) Changed through Jigsaw.

Revision 1.120 2004/11/17 17:50:28 timbl

(timbl) Changed through Jigsaw.

Revision 1.117 2004/11/03 20:22:28 timbl

(timbl) Changed through Jigsaw.

Revision 1.115 2004/11/03 20:08:18 timbl

(timbl) Changed through Jigsaw.

Revision 1.113 2004/07/03 13:33:38 timbl

editing

Revision 1.112 2004/06/25 11:16:19 timbl

(timbl) Changed through Jigsaw.

Revision 1.103 2004/06/24 19:16:30 timbl

(timbl) Changed through Jigsaw.

Revision 1.99 2004/06/11 13:31:52 timbl

(timbl) Changed through Jigsaw.

Revision 1.98 2004/06/08 18:48:53 timbl

Change mime type?

Revision 1.97 2004/04/16 15:13:48 timbl

plit out olf grammar.

Revision 1.96 2004/03/01 15:41:29 timbl

(timbl) Changed through Jigsaw.

Revision 1.95 2004/02/01 04:28:29 timbl

Add table of parsers

Revision 1.94 2004/01/31 22:43:10 timbl

Add table of features

Revision 1.91 2003/12/02 21:54:40 timbl

oops

Revision 1.90 2003/12/02 21:34:25 timbl

pointers to bnf

Revision 1.89 2003/12/02 21:33:31 timbl

pointers to bnf

Revision 1.88 2003/10/31 17:58:17 timbl

Fixed conflicts

Revision 1.87 2003/09/09 16:30:07 timbl

(timbl) Changed through Jigsaw.

Revision 1.86 2003/04/01 15:37:51 timbl

(timbl) Changed through Jigsaw.

Revision 1.85 2003/03/19 19:13:35 timbl

(timbl) Changed through Jigsaw.

Revision 1.81 2003/03/18 16:03:12 timbl

(timbl) Changed through Jigsaw.

Revision 1.80 2002/10/19 19:53:18 timbl

DRAFT

Revision 1.79 2002/08/16 22:22:44 timbl

(timbl) Changed through Jigsaw.

Revision 1.68 2002/05/31 20:42:47 timbl

(timbl) Changed through Jigsaw.

Revision 1.67 2002/05/24 16:17:32 timbl

(timbl) Changed through Jigsaw.

Revision 1.66 2002/04/03 19:48:40 timbl

(timbl) Changed through Jigsaw.

Revision 1.64 2002/03/15 21:22:37 timbl

(timbl) Changed through Jigsaw.

Revision 1.63 2002/03/15 13:24:52 timbl

(timbl) Changed through Jigsaw.

Revision 1.62 2002/02/05 23:29:42 duerst

(duerst) Changed through Jigsaw.

Revision 1.61 2002/01/23 19:30:41 timbl

(timbl) Changed through Jigsaw.

Revision 1.58 2002/01/11 19:10:37 connolly

(connolly) Changed through Jigsaw.

Revision 1.56 2001/12/05 20:05:56 timbl

(timbl) Changed through Jigsaw.

Revision 1.53 2001/12/04 18:39:38 timbl

(timbl) Changed through Jigsaw.

Revision 1.50 2001/11/28 20:20:01 timbl

(timbl) Changed through Jigsaw.

Revision 1.49 2001/11/27 23:59:33 timbl

(timbl) Changed through Jigsaw.

Revision 1.48 2001/11/27 23:58:39 timbl

(timbl) Changed through Jigsaw.

Revision 1.47 2001/11/27 21:11:31 timbl

(timbl) Changed through Jigsaw.

Revision 1.46 2001/11/27 21:05:12 timbl

(timbl) Changed through Jigsaw.

Revision 1.42 2001/11/19 20:43:17 connolly

(connolly) Changed through Jigsaw.

Revision 1.41 2001/09/18 12:19:52 timbl

(timbl) Changed through Jigsaw.

Revision 1.40 2001/09/01 03:16:35 connolly

(connolly) Changed through Jigsaw.

Revision 1.38 2001/04/10 03:16:01 duerst

(duerst) Changed through Jigsaw.

Revision 1.34 2001/04/03 23:09:40 connolly

fixing n3 example in textara again...

Revision 1.33 2001/04/03 23:08:10 connolly

oops fixed textarea

Revision 1.32 2001/04/03 23:07:19 connolly

updated form

Revision 1.31 2001/04/01 20:12:22 timbl

(timbl) Changed through Jigsaw.

Revision 1.27 2001/03/16 18:22:47 timbl

(timbl) Changed through Jigsaw.

Revision 1.25 2001/03/15 15:33:30 timbl

(timbl) Changed through Jigsaw.

Revision 1.23 2001/02/02 02:43:20 timbl

updates

Revision 1.22 2001/01/05 23:04:01 timbl

Considered design alternatives

Moved to a separate page

References

N3-relevant functions

These are not part of the N3 language, but are properties which allow N3 to be used to express rules, and rules which talk about the provenance of information, etc. Just as OWL is expressed in RDF by defining properties, so rules, queries, differences, and so on can be expressed in RDF with the N3 extension to formulae.

The log: namespace has functions which have built-in meaning for cwm, and in some cases have also been used by other code.

log:implies

This implication links two formulae. The cwm rule engine recognises implies as a primitive, and will, when asked to process a rule file, look for any top level implication and find all matches in the store with the left hand side, generating the corresponding conclusion in each case.
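The matching step can be pictured with a toy rule engine. The Python below is a minimal sketch of one pass of rule application and is in no way the cwm algorithm: triples are plain tuples of strings, variables are strings starting with "?", the function names are invented, and fam:grandmother is a made-up property used only for this example.

```python
def match(pattern, triple, bindings):
    """Try to match one pattern triple against one stored triple."""
    new = dict(bindings)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            # A variable must keep the value it was first bound to.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new


def solutions(patterns, store, bindings=None):
    """Yield every set of bindings that satisfies all the patterns."""
    bindings = {} if bindings is None else bindings
    if not patterns:
        yield bindings
        return
    first, rest = patterns[0], patterns[1:]
    for triple in store:
        extended = match(first, triple, bindings)
        if extended is not None:
            yield from solutions(rest, store, extended)


def apply_rule(lhs, rhs, store):
    """Return the conclusions of { lhs } => { rhs } not already in the store."""
    new = set()
    for b in solutions(lhs, store):
        for pattern in rhs:
            triple = tuple(b.get(term, term) for term in pattern)
            if triple not in store:
                new.add(triple)
    return new


store = {(":joe", "fam:mother", ":ann"), (":ann", "fam:mother", ":eve")}
lhs = [("?x", "fam:mother", "?y"), ("?y", "fam:mother", "?z")]
rhs = [("?x", "fam:grandmother", "?z")]

for conclusion in apply_rule(lhs, rhs, store):
    print(" ".join(conclusion) + " .")
```

A fuller engine would add the conclusions to the store and repeat until nothing new is generated; this sketch stops after a single pass.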

log:Truth

A class of all true formulae.

(The cwm engine will process rules in the (indirectly command-line specified) formula or any formula which it declares to be a Truth.)

The dereifier will output any described formulae which are described as being in the class Truth.

log:semantics

The relation between a document and the logical expression it parses to.

log:includes