Notation3 (N3) is a compact and readable alternative to RDF's XML syntax, extended to allow greater expressiveness. It has subsets, one of which is equivalent to RDF 1.0, and one of which is RDF plus a form of RDF rules.
Aims:
The language achieves these with the following features:
Resources on Notation3 include the following:
Syntax definition of N3 (in RDF/XML; in RDF/N3)
For various purposes it is useful to define limited subsets of the language.
NTriples is an extremely constrained language which uses hardly any syntactic sugar at all: it is optimized for reading by scripts and for comparison using text tools. It allows just one triple on each line. It was designed for the reference output of parsers in the RDF test suite. (It is not committed to be an exact subset of N3: as currently specified, it uses a \uXXXX form for encoding Unicode characters in URIs, at variance with N3 and with the IRI drafts, which use %XX encoding of UTF-8. Currently this output option is the --n3=u flag in cwm.)
Turtle is another subset, for only expressing RDF. It is like n3-rdf below except that it does not have the path syntax.
A grammar for N3 is available in RDF, and I have made a few subsets which may be useful, below. I haven't tested the subsets much; mail me if you are interested in using them in earnest.
N3 the full language | N3 | RDF/XML | HTML | yacc |
N3-rdf (under development). This is an N3 language which is constrained so that only correct RDF graphs can be defined. This is all you need for data. | N3 | RDF/XML | HTML | |
N3-rules (under development). This subset allows {} only for making rules like {...} =>{...}, to be equivalent to various other rule languages out there. | N3 | RDF/XML | HTML | |
N3-QL (under development) Restrictions very similar to N3-Rules | N3 |
Feature | Expresses RDF 1.0 | @prefix [], ; a | Collections | Numeric literals (int) | Numeric literals (float) | Literal subj | RDF Path | Rules | Formulae |
---|---|---|---|---|---|---|---|---|---|
syntax: | | | (<a> <b>) | 2 | | 7 a n:prime. | x!y^z | {?x}=>{?x} | {} @forAll |
NTriples | y | | | | | | | | |
Turtle | y | y | y | y 2004/2 | | | | | |
N3 RDF | y | y | y | y | y | y | y | | |
N3 Rules | y plus | y | y | y | y | y | y | y | |
N3 | y plus | y | y | y | y | y | y | y | y |
See also SWAG: Notation3 to RDF Converter, which (as of Aug 2001) is more actively maintained than this form.
N3 files are encoded in UTF-8 (see RFC 2279), normalized in Normalization Form C. The language is defined in terms of a sequence of Unicode characters. (Implementations may choose to work with 8-bit bytes, passing bytes >7F transparently, but this will not allow them to check the validity of embedded non-ASCII characters. A full Unicode implementation would in some ways be preferable. The exact set of Unicode characters to be allowed in (a) literal strings or (b) identifiers is not yet clear.)
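The encoding requirement can be checked before parsing. The following is a minimal sketch, not taken from any N3 implementation; the function name `decode_n3_bytes` is an assumption made for illustration.

```python
import unicodedata


def decode_n3_bytes(data: bytes) -> str:
    """Decode raw bytes as UTF-8 and require Normalization Form C."""
    # Raises UnicodeDecodeError if the bytes are not valid UTF-8.
    text = data.decode("utf-8")
    # A string is in NFC exactly when normalizing it changes nothing.
    if unicodedata.normalize("NFC", text) != text:
        raise ValueError("input is not in Normalization Form C")
    return text


composed = "caf\u00e9".encode("utf-8")     # precomposed e-acute: already NFC
decomposed = "cafe\u0301".encode("utf-8")  # e + combining acute: not NFC
text = decode_n3_bytes(composed)
```

A byte-transparent implementation of the kind mentioned above would skip both checks, which is why it cannot detect invalid non-ASCII input.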
This document defines (more or less) the allowable syntax and, for documents of valid syntax the meaning, of a document in a proposed text/rdf+n3 mime type. This mime type is not used with a charset parameter: the encoding is always utf-8.
(The type application/n3 was applied for at one point (2002?) but I have no trace of any reply from IANA.)
The original hand-written grammar is a bit crufty; several more formal grammars have been developed. The n3.n3 grammar is definitive.
Parser | level | written in | Author | Notes |
---|---|---|---|---|
notation3.py | N3 | Python | Connolly & Berners-Lee | Handwritten, used in cwm, W3C open source |
RDF/N3 parser. | ? | Python | Graham Klyne | |
n3.py | No () or {} | Python | Sean Palmer | 2002. Still extant? Part of Eep |
afon | N3 | Python | Sean Palmer | Uses regexps. Roughly same speed as notation3.py |
RDF::Notation3 | ? | Perl | Ginger Alliance | |
notation3 parser | N3 | Java | Jos de Roo | Part of Agfa's Euler |
N3 parser (?) | ? | Java | Andy Seaborne, HP Labs | Part of HP's Jena |
parser | ~Turtle, sans lists | PHP | Gunnar Grimnes | Based on SBP's n3.py, GPL |
RAP | N3-rdf? | PHP | chris@bizer.de (Chris Bizer) | RDF API for PHP is a software package for searching, manipulating, and serving RDF models, integrated RDF/XML, N3 and N-Triple parser and serializer. |
Raptor | Turtle | C | Dave Beckett | Redland compatible |
SWI-Prolog | Turtle | Prolog | Jan Wielemaker | parser |
n3.bnf | ? | blindfold | Sandro Hawke | Blindfold is a bnf-driven parser. |
flaten3 | ? | lex, yacc | Sandro Hawke | "A first pass at an n3 parser using lex and yacc" |
n3spark.py | ? | spark | Sandro Hawke | A re-implementation of RDF/n3 syntax using the SPARK tools. Aug 2001; discussion |
rfdn3-gram | ? | yapps | Dan Connolly | A Yapps grammar for RDF Notation 3. Aug 2001; discussion |
Eulermoz | ? | Javascript | Euler team | Euler-inspired inference in Mozilla |
n3p.py | ? | Python | Sean Palmer | See email |
This list may be inaccurate and is probably out of date. Mail me with differences you know about, and if you are using this stuff, check the web sites and Google for new implementations.
Here are some manifests of test files. A positive parser test is one which an N3 parser should parse. It has a set of NTriples which should be produced. A negative parser test is a file which should produce an error when an N3 parser tries it.
Tests | Level | Author | Notes |
---|---|---|---|
SWAP | N3 | Scharf et al | Positive and negative parser tests. |
Turtle tests | Turtle | David Beckett | Positive and negative parser tests |
NTriples examples | NTriples | ? | Do tests exist? Maybe examples from the RDF specs. |
Parser implementers are encouraged to generate test results in RDF in the same form, so that results from multiple implementations can be tabulated.
Tokenizing is not explicitly specified, in that white space is left out of the BNF here for simplicity. White space must be inserted whenever the following token could start with a character which could extend the preceding token. White space may not be inserted within a token. An exception may be that we allow and remove white space within a URI in angle brackets.
All URIs are quoted with angle brackets. White space within the <> is to be ignored. White space may therefore be used on output to split a long URI between lines.
Qualified names have colons, so unquoted alphanumerics are all keywords, unless the @keywords directive is given, in which case the keywords given are keywords and anything else is a localname in the default namespace. Any keyword may be given even if not in the keyword list by prefacing it with "@". Because keywords are declared in this way, we will have the freedom later to make extensions to the syntax using new keywords without fear of ambiguity. However, the tokenizer has to be aware of the @keywords setting.
The @prefix directive binds a prefix to a namespace URI. It indicates that a qname with that prefix is shorthand for a URI consisting of the concatenation of the namespace identifier and the bit of the qname to the right of the (only allowed) colon.
The empty prefix "" may be, and is by default (2004/3 on), bound to the empty URI "". This means that <#foo> can be written :foo, saving two bytes(!). With the @keywords directive one can reduce that to foo.
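The concatenation rule for qnames can be sketched in a few lines. This is an illustration only: the function name `expand_qname` and the prefix table are assumptions, and the empty prefix is bound to "#" here so that :foo comes out as #foo, matching the example in the text.

```python
def expand_qname(qname: str, prefixes: dict) -> str:
    """Expand prefix:local into namespace + local by plain concatenation."""
    prefix, colon, local = qname.partition(":")
    if not colon:
        raise ValueError(f"not a qname (no colon): {qname!r}")
    if ":" in local:
        raise ValueError(f"only one colon is allowed: {qname!r}")
    if prefix not in prefixes:
        raise KeyError(f"prefix not bound by @prefix: {prefix!r}")
    return prefixes[prefix] + local


# Illustrative bindings, as if declared with @prefix (assumed values).
prefixes = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "": "#",
}
title_uri = expand_qname("dc:title", prefixes)
foo_uri = expand_qname(":foo", prefixes)
```

Note that no check is made that the namespace ends in "#" or "/"; the rule in the text is pure string concatenation.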
The """value""" string form is used simply for multi-line values or values containing quote marks.
(To do: define literal data to an arbitrary terminator ( """"""zyzzy"This is a string"zyzzy"""""") or something?)
The Unicode sets are not well defined: be conservative in what you use and liberal in what you accept. This may be the subject of further study. (This should be somebody else's problem, in that the set of name characters is common to many languages. In this case, the treatment of Unicode characters outside the ASCII set is done compatibly with the XML 1.1 standard.)
In property lists, the semicolon ; is shorthand for repeating the subject, in object lists the comma "," is shorthand for repeating the verb.
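As a rough sketch of what the two shorthands expand to, the function below turns a subject plus a nested property list into plain triples. The data layout and the name `expand_property_list` are assumptions for illustration, as is the property x:nick.

```python
def expand_property_list(subject, property_list):
    """Expand (verb, [objects]) pairs for one subject into plain triples.

    Each pair stands for one ';'-separated clause; the objects within a
    pair stand for a ','-separated object list.
    """
    return [(subject, verb, obj)
            for verb, objects in property_list
            for obj in objects]


# :ora x:firstname "Ora"; x:nick "ora", "ol" .   (x:nick is hypothetical)
triples = expand_property_list(
    ":ora",
    [("x:firstname", ['"Ora"']), ("x:nick", ['"ora"', '"ol"'])],
)
```

The single statement with one semicolon and one comma yields three triples, all sharing the subject, and the last two sharing the verb.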
[ pl ] means x, where there exists some x such that pl holds. So,
[ :firstname "Ora" ] dc:wrote [ dc:title "Moby Dick" ] .
is a statement (false, I suspect) which in mathematical notation would mean
exists x, y . firstname(x, "Ora") & dc:wrote(x,y) & dc:title (y, "Moby Dick")
or in English "Some person who has a first name Ora wrote a book entitled "Moby Dick"". Note not "the book" or "the person".
This can equally well be written
[x:firstname "Ora" ; dc:wrote [dc:title "Moby Dick" ]].
or
[] x:firstname "Ora" ; dc:wrote [dc:title "Moby Dick" ].
These are just shorthand. x!p means [ is p of x ] in the above anonymous node notation. You can read it as "x's p". This is a little reminiscent of the "." in the "object.slot" syntax of object-oriented programming. (Note that "." could historically be used instead of "!", but if it is, it must be immediately followed by the next path element with no white space. This is to distinguish it from the trailing "." of a statement. The tokenizer needs to look ahead one character to resolve these. This use of "." is deprecated.)
The reverse traversal, x^p means [ p x ] . For either forward or backward traversal, p is a property, and x can be a whole path with both ! and ^ in it. Any path with at least one traversal is anonymous.
Example:
:joe!fam:mother!loc:office!loc:zip | Joe's mother's office's zipcode |
:joe.fam:mother^fam:mother | Anyone whose mother is Joe's mother. |
Note: The path traversal operator was just "!" and then was "." and currently is either. This is an open design question and parsers should accept either, but generators should only use "!".
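The path shorthand can be desugared mechanically into triples over fresh blank nodes. The sketch below handles only "!" and "^" between qnames (no angle-bracket URIs, no "." form); the function name and the _:bN labels are assumptions for illustration.

```python
import itertools
import re


def expand_path(path: str):
    """Return (final_node, triples) for a path built with '!' and '^'."""
    # The capturing group keeps the operators in the split result:
    # "a!p^q" -> ["a", "!", "p", "^", "q"]
    tokens = re.split(r"([!^])", path)
    counter = itertools.count(1)
    node = tokens[0]
    triples = []
    for op, prop in zip(tokens[1::2], tokens[2::2]):
        blank = f"_:b{next(counter)}"
        if op == "!":
            triples.append((node, prop, blank))  # x!p : x p _:b
        else:
            triples.append((blank, prop, node))  # x^p : _:b p x
        node = blank
    return node, triples


zip_node, zip_triples = expand_path(":joe!fam:mother!loc:office!loc:zip")
sib_node, sib_triples = expand_path(":joe!fam:mother^fam:mother")
```

In both cases the value of the whole path is the last blank node introduced, which is why any path with at least one traversal is anonymous.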
An RDF document parses to a set of statements. N3 allows such a set itself to be referred to within the language, and calls it a formula. A { statementlist } is a formula whose meaning is the logical conjunction (equivalent to syntactic juxtaposition) of the statements in the list. It is a set, as the same statement occurring more than once adds no meaning. It is an unordered set. It is an RDF graph.
Apart from the set of statements, and extending the basic RDF graph, it also has a set of URIs of symbols which are universally quantified, and a set of URIs of symbols which are existentially quantified.
{ [ x:firstname "Ora" ] dc:wrote [ dc:title "Moby Dick" ] } a n3:falsehood .
This claims that the expression in {} is false - that there is nothing called Ora which wrote anything titled "Moby Dick".
A formula is considered, like a literal string, to be defined only by its contents.
As well as a set of statements, a formula comprises two sets, one of URIs which will be used as universal variables, and one of URIs which will be used as existential variables.
The semantics of a formula are that the contents are quoted. Variable substitution does recursively take place within a formula, but substitution of equals does not. The variable substitution is used, for example, when formulae are used for rules and for patch file formats. See the tutorial introduction to rules.
Certain properties may, by their semantics, allow the propagation of substitution of equals, by agents which are aware of those semantics. So, for example, it is useful to define a disjunction operator ex:or such that if the statement { F ex:or G } is true, where F and G are formulae, and a = b, then { F ex:or G' } is also true, where G' is the result of substituting b for a in G.
The terminology is that the set of statements is a formula; the particular formula in which a statement is found is its context. "Formula" is therefore the class, and "context" is the relationship between a statement and a formula.
N3 allows the _: namespace as in NTriples. These identifiers are used to identify blank nodes in the graph. They are generalized in N3 such that they identify blank nodes in the local formula. They are arbitrary temporary names for nodes which are existentially quantified within the current formula (not the whole file).
The @forAll directives declare variables which are universally quantified: the formula is true for any value of the variable. Similarly, @forSome gives an existential quantification: there exists some value of the variable for which the context is true. (Note also that the square bracket notation introduces a blank node, which is an unnamed anonymous existential variable, and also the _: namespace from the NTriples spec, which is a dummy namespace used to represent nodes which are existentially quantified and unnamed.)
{ @forSome <#g>. <#g> <#loves> <#you> } .
is equivalent to
[ <#loves> <#you> ] .
If both universal and existential quantification are specified for the same context, then the scope of the universal quantification is outside the scope of the existentials:
{ @forAll <#h>. @forSome <#g>. <#g> <#loves> <#h> }.
means
for all h there exists g such that g loves h
("Everybody has someone who loves them" rather than "Somebody loves everybody")
which you might think of as
∀h(∃g(loves(g,h)))
Escaping in strings uses the same conventions as Python strings except for a \U extension introduced by NTriples spec. N3 strings represent ordered sequences of Unicode characters.
Some escapes (\a, \b, \f, \v) should be avoided because the corresponding characters are not allowed in RDF. This is not yet (2001/3) implemented in the cwm parser.
Escape Sequence | Meaning |
---|---|
\newline | Ignored |
\\ | Backslash (\) |
\' | Single quote (') |
\" | Double quote (") |
\n | ASCII Linefeed (LF) |
\r | ASCII Carriage Return (CR) |
\t | ASCII Horizontal Tab (TAB) |
\ooo | byte with octal value ooo (deprecated**) |
\xhh | byte with hex value hh (deprecated**) |
\uhhhh | character in BMP with Unicode value U+hhhh |
\U00hhhhhh | character in plane 1-16 with Unicode value U+hhhhhh |
Note that in N3, the double quote character is used for strings. The single quote character is reserved for future use. The single quote character does not need to be escaped in an N3 string.
**RDF and N3 are defined in terms of characters, not bytes. Therefore, the \ooo and \xhh escapes are deprecated.
The hexadecimal digits in Unicode escapes are UPPERCASE. This is designed to match the NTriples strings.
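A decoder for the non-deprecated escapes might look like the following. This is a minimal sketch under stated assumptions, not the cwm implementation: it rejects the byte-oriented \ooo and \xhh forms and the escapes to be avoided, accepts only uppercase hex digits, and leaves a lone trailing backslash untouched. The name `unescape` is illustrative.

```python
import re

# Single-character escapes; an escaped newline is ignored (maps to "").
_SIMPLE = {"\\": "\\", "'": "'", '"': '"',
           "n": "\n", "r": "\r", "t": "\t", "\n": ""}

# A backslash followed by \uXXXX, \UXXXXXXXX (uppercase hex) or any one character.
_ESCAPE = re.compile(r"\\(u[0-9A-F]{4}|U[0-9A-F]{8}|.)", re.DOTALL)


def unescape(body: str) -> str:
    """Decode the escapes in the body of an N3 string (quotes removed)."""
    def replace(match):
        code = match.group(1)
        if code[0] in "uU" and len(code) > 1:
            return chr(int(code[1:], 16))  # Unicode code point escape
        if code in _SIMPLE:
            return _SIMPLE[code]
        raise ValueError(f"unsupported escape: \\{code}")
    return _ESCAPE.sub(replace, body)


decoded = unescape('Moby\\tDick \\u00E9')
```

Because the result is built from characters rather than bytes, the sketch follows the point made above that RDF and N3 are defined in terms of characters.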
This syntax does not allow minus signs in identifiers, whereas the XML encoding for RDF does.
The current solution is mapping sequences of "-" and "_" into sequences of "_" by taking a contiguous sequence of - and _, replacing _ with 0 and - with 1, then adding a leading "1", taking what you have as a binary number, subtracting 1 from the result, and then writing that many _ signs. The mapping is 1:1, and maps the simple case of - onto __ and _ onto _. The only disadvantage is that those who go crazy with n consecutive occurrences of - and/or _ in XML will pay for it in an even crazier 2**n long sequence in the N3. (2001/12/4)
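The mapping described above can be written out directly, together with its inverse. The sketch below follows the steps in the text; the function names are assumptions for illustration.

```python
import re


def encode_run(run: str) -> str:
    """Map one run of '-' and '_' to a run of '_' (XML name to N3 name)."""
    # '_' -> 0, '-' -> 1, prefix a leading 1, read as binary, subtract 1.
    bits = "1" + run.replace("_", "0").replace("-", "1")
    return "_" * (int(bits, 2) - 1)


def decode_run(run: str) -> str:
    """Inverse: map a run of '_' back to the original '-' / '_' run."""
    # Add 1, write in binary, drop the '0b' marker and the leading 1.
    bits = bin(len(run) + 1)[3:]
    return bits.replace("0", "_").replace("1", "-")


def xml_to_n3(name: str) -> str:
    """Encode every maximal run of '-' and '_' in an XML name."""
    return re.sub(r"[-_]+", lambda m: encode_run(m.group()), name)


def n3_to_xml(name: str) -> str:
    """Decode every maximal run of '_' in an N3 name."""
    return re.sub(r"_+", lambda m: decode_run(m.group()), name)


simple_hyphen = xml_to_n3("first-name")     # '-' becomes '__'
simple_underscore = xml_to_n3("first_name")  # '_' stays '_'
```

A run of n characters maps to at most 2**(n+1) - 2 underscores, which is the exponential cost the text warns about.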
A messy thing from the N3 point of view is that for content negotiation to work between RDF/XML and N3, the fragment identifier syntax must be the same, and this would suggest that both use the "-" (XML) form.
It would be nice to be able to outlaw - in IDs for RDF.
The N3 grammar describes the syntax for various literal productions. These syntax strings identify values which are members of various classes of number. The bnf productions should not be confused with the relationship between number classes. In the syntax, integer, rational and real productions are distinct. When it comes to the values they represent, all integers are also rational numbers, and all rationals are also real numbers.
My assumption about the number model is that there is no distinction between the rationals 1.0 and 1/1 and the integer 1. I haven't implemented this yet, but my intent is that the semantics of these numbers are true to normal mathematical equality.
The issue of reals is more complicated, as any real literal is necessarily approximate. So while there is a real number which is equal to the rational 1 or 1/3, reals do not support this comparison. (See XML Schema Datatypes.)
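The intended equality can be illustrated with exact rational arithmetic. In this sketch Python's `Fraction` stands in for the integer and rational literals and a binary `float` stands in for a real literal; that correspondence is an assumption for illustration, not part of N3.

```python
from fractions import Fraction

# The integer 1 and the rationals 1.0 and 1/1 denote the same number.
one_forms = [Fraction("1"), Fraction("1.0"), Fraction("1/1")]
all_equal = all(value == 1 for value in one_forms)

# A real literal is necessarily approximate: the float nearest to 1/3
# is not exactly the rational 1/3, so an exact comparison fails.
third = Fraction(1, 3)
approx_third = 1 / 3
exact_match = (third == approx_third)
```

Here `all_equal` is true while `exact_match` is false, which is the asymmetry between rationals and reals described above.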
Whilst complex numbers (with integer, rational or real parts) are a reasonable class to add, I don't see it as a priority at the moment.
Notation3, of any level, can be represented as an RDF graph using a vocabulary <http://www.w3.org/2004/06/rei#>. Code exists (cwm --reify and cwm --dereify) to convert N3 to such a description and back. This allows, for example, rule files to be manipulated as RDF data.
This reification also allows an RDF graph to be described, quoted, within another RDF graph. (Note that the "reification" described in the RDF 1.0 specification is different, does not quote properly, and is not recommended.)
A notation3 document has fragment identifiers in the form of alphanumeric strings with an initial alpha; underscores are allowed but minus signs are not.
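The shape of a fragment identifier can be checked with a short pattern. This sketch assumes ASCII letters and digits only, since the exact Unicode character sets are left open; the name `is_n3_fragment` is illustrative.

```python
import re

# Initial letter, then letters, digits or underscores (ASCII only, by assumption).
_FRAGMENT = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_n3_fragment(name: str) -> bool:
    """True if name has the fragment identifier form described above."""
    return _FRAGMENT.fullmatch(name) is not None


accepted = [name for name in ("ora", "first_name", "first-name", "9lives")
            if is_n3_fragment(name)]
```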
To label something you just invent that name and use it. There is no distinction between definition and reference. This is a fact of life, not of RDF or notation3. (A definitive reference is one in a document demonstrated to be definitive with respect to the namespace... but that is another story.)
<#ora> x:firstname "Ora".
<#ora> x:lastname "Lassila".
or indeed
:ora x:firstname "Ora"; x:lastname "Lassila".
A summary of recent design issues in Notation3:
Much detail on other and older issues is listed separately.
Dan Connolly wrote the first N3 parser. Sean Palmer and other folks on irc://openprojects.net/rdfig suggested many things and reviewed new ideas (and scrapped old ones!). Thanks to all implementers.
Added ( node node ...) list shorthand, as code now reads and writes it.
Added a little about self-describing documents.
$Log: Notation3.html,v $
Revision 1.123 2005/01/12 17:07:57 timbl
(timbl) Changed through Jigsaw.
Moved to a separate page
See also links in the text.
D. Beckett, New Syntaxes for RDF summarizes the state of the art as of 2004.
N3QL, a draft proposal for an RDF query language.
These are not part of the N3 language, but are properties which allow N3 to be used to express rules, and rules which talk about the provenance of information, etc. Just as OWL is expressed in RDF by defining properties, so rules, queries, differences, and so on can be expressed in RDF with the N3 extension to formulae.
The log: namespace has functions which have built-in meaning for cwm and which, in some cases, have also been used by other code.
See also:
The terms are in the namespace <http://www.w3.org/2000/10/logic.n3#>. Check the schema for the low-down; here are some highlights.
This implication links two formulae. The cwm rule engine recognises implies as a primitive, and will, when asked to process a rule file, look for any top level implication and find all matches in the store with the left hand side, generating the corresponding conclusion in each case.
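The matching step can be pictured with a toy forward-chaining pass. This is a sketch only and makes no claim about how cwm is implemented: triples are tuples of strings, variables are strings beginning with "?", and the names `match`, `apply_rule` and the property fam:child are assumptions for illustration.

```python
def match(pattern, triple, bindings):
    """Extend bindings so that pattern equals triple, or return None."""
    new = dict(bindings)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            # Bind the variable, or fail if it is already bound elsewhere.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new


def apply_rule(lhs, rhs, store):
    """Return the conclusions of one pass of {lhs} => {rhs} over the store."""
    solutions = [{}]
    for pattern in lhs:
        extended = []
        for bindings in solutions:
            for triple in store:
                result = match(pattern, triple, bindings)
                if result is not None:
                    extended.append(result)
        solutions = extended
    # Substitute each solution into every statement of the right-hand side.
    return {tuple(b.get(term, term) for term in conclusion)
            for b in solutions for conclusion in rhs}


# { ?x fam:mother ?y } => { ?y fam:child ?x } .   (fam:child is hypothetical)
store = {(":joe", "fam:mother", ":ann")}
concluded = apply_rule([("?x", "fam:mother", "?y")],
                       [("?y", "fam:child", "?x")], store)
```

Each match of the left-hand side against the store produces one set of bindings, and each set of bindings generates the corresponding conclusion, as in the description above.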
A class of all true formulae.
(The cwm engine will process rules in the (indirectly command-line specified) formula or in any formula which it declares to be a Truth.)
The dereifier will output any described formulae which are described as being in the class Truth.
The relation between a document and the logical expression it parses to.