graph/formats/rdf: new package for RDF N-Quad parsing

This code is based on the RDF N-Quad parsing code that I wrote for the Cayley
graph database project in 2014. The code here does not include any code that
was written by other members of the Cayley project and so is unencumbered by
copyright ownership from that project.

License addition is for the test suite from [1] linked from [2]. A second more
restrictive license is possible if we are claiming spec compliance[3].

[1]https://www.w3.org/Consortium/Legal/2008/03-bsd-license
[2]https://www.w3.org/Consortium/Legal/2008/04-testsuite-copyright.html
[3]https://www.w3.org/Consortium/Legal/2008/04-testsuite-license.html
This commit is contained in:
Dan Kortschak
2021-01-13 20:31:06 +10:30
committed by GitHub
parent 6568654ff6
commit d39af6a71b
30 changed files with 8169 additions and 6 deletions


@@ -11,6 +11,6 @@ jobs:
 uses: golangci/golangci-lint-action@v2.3.0
 with:
 # Required: the version of golangci-lint is required and must be specified without patch version: we always use the latest patch version.
-version: v1.28
+version: v1.34
 only-new-issues: true
 args: --timeout=5m


@@ -1,5 +1,7 @@
 sudo: false
+dist: bionic
 language: go
 # Do not move these lines; they are referred to by README.md.
@@ -89,6 +91,11 @@ matrix:
 before_install:
 - ${TRAVIS_BUILD_DIR}/.travis/run-parts ${TRAVIS_BUILD_DIR}/.travis/deps.d/${TRAVIS_OS_NAME}
+addons:
+apt:
+packages:
+- ragel
 go_import_path: gonum.org/v1/gonum
 # Get deps, build, test, and ensure the code is gofmt'ed.


@@ -1,4 +1,4 @@
 #!/bin/bash
 set -e
-check-copyright -notice "Copyright ©20[0-9]{2} The Gonum Authors\. All rights reserved\."
+check-copyright -notice "(Copyright ©20[0-9]{2} The Gonum Authors\. All rights reserved\.|[Cc]ode generated by .*; DO NOT EDIT\.)"


@@ -9,6 +9,7 @@ go generate gonum.org/v1/gonum/blas/gonum
 go generate gonum.org/v1/gonum/unit
 go generate gonum.org/v1/gonum/unit/constant
 go generate gonum.org/v1/gonum/graph/formats/dot
+go generate gonum.org/v1/gonum/graph/formats/rdf
 go generate gonum.org/v1/gonum/stat/card
 if [ -n "$(git diff)" ]; then


@@ -18,6 +18,8 @@ go get golang.org/x/tools/cmd/cover
 go get github.com/mattn/goveralls
 # Required for dot parser checks.
 go get github.com/goccmack/gocc@66c61e9
+# Required for rdf parser checks.
+go get golang.org/x/tools/cmd/stringer
 # Clean up.
 # TODO(kortschak): Remove when golang/go#30515 is resolved.


@@ -58,3 +58,5 @@ https://groups.google.com/forum/#!forum/gonum-dev
 Original code is licensed under the Gonum License found in the LICENSE file. Portions of the code are subject to the additional licenses found in THIRD_PARTY_LICENSES. All third party code is licensed either under a BSD or MIT license.
 Code in graph/formats/dot is dual licensed [Public Domain Dedication](https://creativecommons.org/publicdomain/zero/1.0/) and Gonum License, and users are free to choose the license which suits their needs for this code.
+The W3C test suites in graph/formats/rdf are distributed under both the [W3C Test Suite License](http://www.w3.org/Consortium/Legal/2008/04-testsuite-license) and the [W3C 3-clause BSD License](http://www.w3.org/Consortium/Legal/2008/03-bsd-license).


@@ -0,0 +1,26 @@
Copyright © 2008 World Wide Web Consortium®, (MIT, ERCIM, Keio, Beihang) and others.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:
* Redistributions of works must retain the original copyright notice, this
list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the original copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
* Neither the name of the W3C nor the names of its contributors may be
used to endorse or promote products derived from this work without
specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS
IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.


@@ -0,0 +1,56 @@
Copyright © 2008 World Wide Web Consortium, (MIT, ERCIM, Keio, Beihang) and others. All Rights Reserved.
http://www.w3.org/Consortium/Legal/2008/04-testsuite-copyright.html
This document, Test Suites and other documents that link to this statement are
provided by the copyright holders under the following license: By using and/or
copying this document, or the W3C document from which this statement is linked,
you (the licensee) agree that you have read, understood, and will comply with
the following terms and conditions:
Permission to copy, and distribute the contents of this document, or the W3C
document from which this statement is linked, in any medium for any purpose and
without fee or royalty is hereby granted, provided that you include the following
on ALL copies of the document, or portions thereof, that you use:
1. A link or URL to the original W3C document.
2. The pre-existing copyright notice of the original author, or if it doesn't
exist, a notice (hypertext is preferred, but a textual representation is
permitted) of the form: "Copyright © [$date-of-document] World Wide Web
Consortium, (MIT, ERCIM, Keio, Beihang) and others. All Rights Reserved.
http://www.w3.org/Consortium/Legal/2008/04-testsuite-copyright.html"
3. If it exists, the STATUS of the W3C document.
When space permits, inclusion of the full text of this NOTICE should be provided.
We request that authorship attribution be provided in any software, documents,
or other items or products that you create pursuant to the implementation of the
contents of this document, or any portion thereof.
No right to create modifications or derivatives of W3C documents is granted
pursuant to this license. However, if additional requirements (documented in the
Copyright FAQ) are satisfied, the right to create modifications or derivatives
is sometimes granted by the W3C to individuals complying with those requirements.
If a Test Suite distinguishes the test harness (or, framework for navigation) and
the actual tests, permission is given to remove or alter the harness or navigation
if the Test Suite in question allows to do so. The tests themselves shall NOT be
changed in any way.
The name and trademarks of W3C and other copyright holders may NOT be used in
advertising or publicity pertaining to this document or other documents that link
to this statement without specific, written prior permission. Title to copyright
in this document will at all times remain with copyright holders. Permission is
given to use the trademarked string "W3C" within claims of performance concerning
W3C Specifications or features described therein, and there only, if the test
suite so authorizes.
THIS WORK IS PROVIDED BY W3C, MIT, ERCIM, KEIO, BEIHANG, THE COPYRIGHT HOLDERS
AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT
NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL W3C, MIT, ERCIM, KEIO,
BEIHANG, THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.


@@ -0,0 +1,10 @@
Test suite license
This document refers to nquad_tests.tar.gz and ntriple_tests.tar.gz. The original files can be obtained here:
- [nquad_tests.tar.gz](https://w3c.github.io/rdf-tests/nquads/TESTS.tar.gz)
- [ntriple_tests.tar.gz](https://w3c.github.io/rdf-tests/ntriples/TESTS.tar.gz)
Distributed under both the [W3C Test Suite License](https://www.w3.org/Consortium/Legal/2008/04-testsuite-license) and the [W3C 3-clause BSD License](https://www.w3.org/Consortium/Legal/2008/03-bsd-license).
To contribute to a W3C Test Suite, see the [policies and contribution forms](https://www.w3.org/2004/10/27-testcases).

526
graph/formats/rdf/check.go Normal file

@@ -0,0 +1,526 @@
//line check.rl:1
// Go code generated by go generate gonum.org/v1/gonum/graph/formats/rdf; DO NOT EDIT.
// Copyright ©2020 The Gonum Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package rdf
import (
"fmt"
"unicode"
)
//line check.go:18
const checkLabel_start int = 1
const checkLabel_first_final int = 3
const checkLabel_error int = 0
const checkLabel_en_value int = 1
//line check.rl:53
func checkLabelText(data []rune) (err error) {
var (
cs, p int
pe = len(data)
eof = pe
)
//line check.go:37
{
cs = checkLabel_start
}
//line check.rl:63
//line check.go:45
{
if p == pe {
goto _test_eof
}
switch cs {
case 1:
goto st_case_1
case 0:
goto st_case_0
case 3:
goto st_case_3
case 2:
goto st_case_2
}
goto st_out
st_case_1:
if data[p] == 95 {
goto st3
}
switch {
case data[p] < 895:
switch {
case data[p] < 192:
switch {
case data[p] < 65:
if 48 <= data[p] && data[p] <= 58 {
goto st3
}
case data[p] > 90:
if 97 <= data[p] && data[p] <= 122 {
goto st3
}
default:
goto st3
}
case data[p] > 214:
switch {
case data[p] < 248:
if 216 <= data[p] && data[p] <= 246 {
goto st3
}
case data[p] > 767:
if 880 <= data[p] && data[p] <= 893 {
goto st3
}
default:
goto st3
}
default:
goto st3
}
case data[p] > 8191:
switch {
case data[p] < 12289:
switch {
case data[p] < 8304:
if 8204 <= data[p] && data[p] <= 8205 {
goto st3
}
case data[p] > 8591:
if 11264 <= data[p] && data[p] <= 12271 {
goto st3
}
default:
goto st3
}
case data[p] > 55295:
switch {
case data[p] < 65008:
if 63744 <= data[p] && data[p] <= 64975 {
goto st3
}
case data[p] > 65533:
if 65536 <= data[p] && data[p] <= 983039 {
goto st3
}
default:
goto st3
}
default:
goto st3
}
default:
goto st3
}
goto tr0
tr0:
//line check_actions.rl:12
if p < len(data) {
if r := data[p]; r < unicode.MaxASCII {
return fmt.Errorf("%w: unexpected rune %q at %d", ErrInvalidTerm, data[p], p)
} else {
return fmt.Errorf("%w: unexpected rune %q (\\u%04[2]x) at %d", ErrInvalidTerm, data[p], p)
}
}
return ErrIncompleteTerm
goto st0
//line check.go:145
st_case_0:
st0:
cs = 0
goto _out
st3:
if p++; p == pe {
goto _test_eof3
}
st_case_3:
switch data[p] {
case 45:
goto st3
case 46:
goto st2
case 95:
goto st3
case 183:
goto st3
}
switch {
case data[p] < 8204:
switch {
case data[p] < 192:
switch {
case data[p] < 65:
if 48 <= data[p] && data[p] <= 58 {
goto st3
}
case data[p] > 90:
if 97 <= data[p] && data[p] <= 122 {
goto st3
}
default:
goto st3
}
case data[p] > 214:
switch {
case data[p] < 248:
if 216 <= data[p] && data[p] <= 246 {
goto st3
}
case data[p] > 893:
if 895 <= data[p] && data[p] <= 8191 {
goto st3
}
default:
goto st3
}
default:
goto st3
}
case data[p] > 8205:
switch {
case data[p] < 12289:
switch {
case data[p] < 8304:
if 8255 <= data[p] && data[p] <= 8256 {
goto st3
}
case data[p] > 8591:
if 11264 <= data[p] && data[p] <= 12271 {
goto st3
}
default:
goto st3
}
case data[p] > 55295:
switch {
case data[p] < 65008:
if 63744 <= data[p] && data[p] <= 64975 {
goto st3
}
case data[p] > 65533:
if 65536 <= data[p] && data[p] <= 983039 {
goto st3
}
default:
goto st3
}
default:
goto st3
}
default:
goto st3
}
goto st0
st2:
if p++; p == pe {
goto _test_eof2
}
st_case_2:
switch data[p] {
case 45:
goto st3
case 46:
goto st2
case 95:
goto st3
case 183:
goto st3
}
switch {
case data[p] < 8204:
switch {
case data[p] < 192:
switch {
case data[p] < 65:
if 48 <= data[p] && data[p] <= 58 {
goto st3
}
case data[p] > 90:
if 97 <= data[p] && data[p] <= 122 {
goto st3
}
default:
goto st3
}
case data[p] > 214:
switch {
case data[p] < 248:
if 216 <= data[p] && data[p] <= 246 {
goto st3
}
case data[p] > 893:
if 895 <= data[p] && data[p] <= 8191 {
goto st3
}
default:
goto st3
}
default:
goto st3
}
case data[p] > 8205:
switch {
case data[p] < 12289:
switch {
case data[p] < 8304:
if 8255 <= data[p] && data[p] <= 8256 {
goto st3
}
case data[p] > 8591:
if 11264 <= data[p] && data[p] <= 12271 {
goto st3
}
default:
goto st3
}
case data[p] > 55295:
switch {
case data[p] < 65008:
if 63744 <= data[p] && data[p] <= 64975 {
goto st3
}
case data[p] > 65533:
if 65536 <= data[p] && data[p] <= 983039 {
goto st3
}
default:
goto st3
}
default:
goto st3
}
default:
goto st3
}
goto tr0
st_out:
_test_eof3: cs = 3; goto _test_eof
_test_eof2: cs = 2; goto _test_eof
_test_eof: {}
if p == eof {
switch cs {
case 3:
//line check_actions.rl:8
return nil
case 1, 2:
//line check_actions.rl:12
if p < len(data) {
if r := data[p]; r < unicode.MaxASCII {
return fmt.Errorf("%w: unexpected rune %q at %d", ErrInvalidTerm, data[p], p)
} else {
return fmt.Errorf("%w: unexpected rune %q (\\u%04[2]x) at %d", ErrInvalidTerm, data[p], p)
}
}
return ErrIncompleteTerm
//line check.go:338
}
}
_out: {}
}
//line check.rl:65
return ErrInvalidTerm
}
//line check.go:351
const checkLang_start int = 1
const checkLang_first_final int = 4
const checkLang_error int = 0
const checkLang_en_value int = 1
//line check.rl:81
func checkLangText(data []byte) (err error) {
var (
cs, p int
pe = len(data)
eof = pe
)
//line check.go:370
{
cs = checkLang_start
}
//line check.rl:91
//line check.go:378
{
if p == pe {
goto _test_eof
}
switch cs {
case 1:
goto st_case_1
case 0:
goto st_case_0
case 2:
goto st_case_2
case 4:
goto st_case_4
case 3:
goto st_case_3
case 5:
goto st_case_5
}
goto st_out
st_case_1:
if data[p] == 64 {
goto st2
}
goto tr0
tr0:
//line check_actions.rl:12
if p < len(data) {
if r := data[p]; r < unicode.MaxASCII {
return fmt.Errorf("%w: unexpected rune %q at %d", ErrInvalidTerm, data[p], p)
} else {
return fmt.Errorf("%w: unexpected rune %q (\\u%04[2]x) at %d", ErrInvalidTerm, data[p], p)
}
}
return ErrIncompleteTerm
goto st0
//line check.go:416
st_case_0:
st0:
cs = 0
goto _out
st2:
if p++; p == pe {
goto _test_eof2
}
st_case_2:
switch {
case data[p] > 90:
if 97 <= data[p] && data[p] <= 122 {
goto st4
}
case data[p] >= 65:
goto st4
}
goto tr0
st4:
if p++; p == pe {
goto _test_eof4
}
st_case_4:
if data[p] == 45 {
goto st3
}
switch {
case data[p] > 90:
if 97 <= data[p] && data[p] <= 122 {
goto st4
}
case data[p] >= 65:
goto st4
}
goto st0
st3:
if p++; p == pe {
goto _test_eof3
}
st_case_3:
switch {
case data[p] < 65:
if 48 <= data[p] && data[p] <= 57 {
goto st5
}
case data[p] > 90:
if 97 <= data[p] && data[p] <= 122 {
goto st5
}
default:
goto st5
}
goto tr0
st5:
if p++; p == pe {
goto _test_eof5
}
st_case_5:
if data[p] == 45 {
goto st3
}
switch {
case data[p] < 65:
if 48 <= data[p] && data[p] <= 57 {
goto st5
}
case data[p] > 90:
if 97 <= data[p] && data[p] <= 122 {
goto st5
}
default:
goto st5
}
goto st0
st_out:
_test_eof2: cs = 2; goto _test_eof
_test_eof4: cs = 4; goto _test_eof
_test_eof3: cs = 3; goto _test_eof
_test_eof5: cs = 5; goto _test_eof
_test_eof: {}
if p == eof {
switch cs {
case 4, 5:
//line check_actions.rl:8
return nil
case 1, 2, 3:
//line check_actions.rl:12
if p < len(data) {
if r := data[p]; r < unicode.MaxASCII {
return fmt.Errorf("%w: unexpected rune %q at %d", ErrInvalidTerm, data[p], p)
} else {
return fmt.Errorf("%w: unexpected rune %q (\\u%04[2]x) at %d", ErrInvalidTerm, data[p], p)
}
}
return ErrIncompleteTerm
//line check.go:517
}
}
_out: {}
}
//line check.rl:93
return ErrInvalidTerm
}


@@ -0,0 +1,95 @@
// Go code generated by go generate gonum.org/v1/gonum/graph/formats/rdf; DO NOT EDIT.
// Copyright ©2020 The Gonum Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package rdf
import (
"fmt"
"unicode"
)
%%{
machine checkLabel;
alphtype rune;
include check "check_actions.rl";
alphtype rune;
PN_CHARS_BASE = [A-Za-z]
| 0x00c0 .. 0x00d6
| 0x00d8 .. 0x00f6
| 0x00f8 .. 0x02ff
| 0x0370 .. 0x037d
| 0x037f .. 0x1fff
| 0x200c .. 0x200d
| 0x2070 .. 0x218f
| 0x2c00 .. 0x2fef
| 0x3001 .. 0xd7ff
| 0xf900 .. 0xfdcf
| 0xfdf0 .. 0xfffd
| 0x10000 .. 0xeffff
;
PN_CHARS_U = PN_CHARS_BASE | '_' | ':' ;
PN_CHARS = PN_CHARS_U
| '-'
| [0-9]
| 0xb7
| 0x0300 .. 0x036f
| 0x203f .. 0x2040
;
BLANK_NODE_LABEL = (PN_CHARS_U | [0-9]) ((PN_CHARS | '.')* PN_CHARS)? ;
value := BLANK_NODE_LABEL %Return @!Error ;
write data;
}%%
func checkLabelText(data []rune) (err error) {
var (
cs, p int
pe = len(data)
eof = pe
)
%%write init;
%%write exec;
return ErrInvalidTerm
}
%%{
machine checkLang;
alphtype byte;
include check "check_actions.rl";
LANGTAG = '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)* ;
value := LANGTAG %Return @!Error ;
write data;
}%%
func checkLangText(data []byte) (err error) {
var (
cs, p int
pe = len(data)
eof = pe
)
%%write init;
%%write exec;
return ErrInvalidTerm
}
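Review note: the `checkLang` machine accepts exactly the `LANGTAG` production, `'@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*`. As a rough cross-check, the same language can be written as an anchored regular expression. This is only a sketch: `validLangTag` is a hypothetical helper that is not part of the package, and it swaps the Ragel state machine for Go's `regexp`.

```go
package main

import (
	"fmt"
	"regexp"
)

// langTag mirrors the LANGTAG production from check.rl:
//
//	'@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*
//
// It is an illustrative stand-in for the generated state machine.
var langTag = regexp.MustCompile(`^@[a-zA-Z]+(-[a-zA-Z0-9]+)*$`)

// validLangTag reports whether s is a well-formed "@"-prefixed language tag.
// The name is hypothetical; the package itself uses checkLangText.
func validLangTag(s string) bool {
	return langTag.MatchString(s)
}

func main() {
	for _, s := range []string{"@en", "@en-US", "@zh-Hant-1", "en", "@", "@en-", "@1a"} {
		fmt.Printf("%-12q %t\n", s, validLangTag(s))
	}
}
```

A regular expression is easier to read, while the generated machine avoids the regexp engine and can report the offending position; the sketch is useful mainly as a test oracle.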


@@ -0,0 +1,22 @@
// Copyright ©2020 The Gonum Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
%%{
machine check;
action Return {
return nil
}
action Error {
if p < len(data) {
if r := data[p]; r < unicode.MaxASCII {
return fmt.Errorf("%w: unexpected rune %q at %d", ErrInvalidTerm, data[p], p)
} else {
return fmt.Errorf("%w: unexpected rune %q (\\u%04[2]x) at %d", ErrInvalidTerm, data[p], p)
}
}
return ErrIncompleteTerm
}
}%%

8
graph/formats/rdf/doc.go Normal file

@@ -0,0 +1,8 @@
// Copyright ©2020 The Gonum Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// Package rdf implements decoding the RDF 1.1 N-Quads line-based plain text
// format for encoding an RDF dataset.
// N-Quad parsing is performed as defined by http://www.w3.org/TR/n-quads/
package rdf // import "gonum.org/v1/gonum/graph/formats/rdf"

1184
graph/formats/rdf/extract.go Normal file

File diff suppressed because it is too large


@@ -0,0 +1,52 @@
// Go code generated by go generate gonum.org/v1/gonum/graph/formats/rdf; DO NOT EDIT.
// Copyright ©2020 The Gonum Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package rdf
import (
"fmt"
"unicode"
)
%%{
machine extract;
include extract "extract_actions.rl";
include nquads "nquads.rl";
value := (
IRIREF
| '_:' BLANK_NODE_LABEL >StartBlank %EndBlank
| '"' STRING_LITERAL >StartLiteral %EndLiteral '"' ( '^^' IRIREF | LANGTAG >StartLang %EndLang )?
) %Return @!Error ;
write data;
}%%
func extract(data []rune) (text, qual string, kind Kind, err error) {
var (
cs, p int
pe = len(data)
eof = pe
iri = -1
blank = -1
literal = -1
lang = -1
iriText string
blankText string
literalText string
langText string
)
%%write init;
%%write exec;
return "", "", 0, ErrInvalidTerm
}


@@ -0,0 +1,84 @@
// Copyright ©2020 The Gonum Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
%%{
machine extract;
action StartIRI {
iri = p
}
action EndIRI {
if iri < 0 {
panic("unexpected parser state: iri start not set")
}
iriText = unEscape(data[iri:p])
if kind == Invalid {
kind = IRI
}
}
action StartBlank {
blank = p
}
action EndBlank {
if blank < 0 {
panic("unexpected parser state: blank start not set")
}
blankText = string(data[blank:p])
kind = Blank
}
action StartLiteral {
literal = p
}
action EndLiteral {
if literal < 0 {
panic("unexpected parser state: literal start not set")
}
literalText = unEscape(data[literal:p])
kind = Literal
}
action StartLang {
lang = p
}
action EndLang {
if lang < 0 {
panic("unexpected parser state: lang start not set")
}
langText = string(data[lang:p])
}
action Return {
switch kind {
case IRI:
return iriText, "", kind, nil
case Blank:
return blankText, "", kind, nil
case Literal:
qual = iriText
if qual == "" {
qual = langText
}
return literalText, qual, kind, nil
default:
return "", "", kind, ErrInvalidTerm
}
}
action Error {
if p < len(data) {
if r := data[p]; r < unicode.MaxASCII {
return "", "", Invalid, fmt.Errorf("%w: unexpected rune %q at %d", ErrInvalidTerm, data[p], p)
} else {
return "", "", Invalid, fmt.Errorf("%w: unexpected rune %q (\\u%04[2]x) at %d", ErrInvalidTerm, data[p], p)
}
}
return "", "", Invalid, ErrIncompleteTerm
}
}%%


@@ -0,0 +1,26 @@
// Code generated by "stringer -type=Kind"; DO NOT EDIT.
package rdf
import "strconv"
func _() {
// An "invalid array index" compiler error signifies that the constant values have changed.
// Re-run the stringer command to generate them again.
var x [1]struct{}
_ = x[Invalid-0]
_ = x[IRI-1]
_ = x[Literal-2]
_ = x[Blank-3]
}
const _Kind_name = "InvalidIRILiteralBlank"
var _Kind_index = [...]uint8{0, 7, 10, 17, 22}
func (i Kind) String() string {
if i < 0 || i >= Kind(len(_Kind_index)-1) {
return "Kind(" + strconv.FormatInt(int64(i), 10) + ")"
}
return _Kind_name[_Kind_index[i]:_Kind_index[i+1]]
}
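Review note: the stringer output stores all names in one string and slices it with an offset table, which avoids a per-call allocation or a map lookup. The standalone sketch below reproduces the same lookup with renamed, unexported tables (`kindName`, `kindIndex`) so the slicing can be checked by hand; it is a copy for illustration, not additional package code.

```go
package main

import (
	"fmt"
	"strconv"
)

// Kind mirrors the enumeration declared in rdf.go.
type Kind int

const (
	Invalid Kind = iota
	IRI
	Literal
	Blank
)

// kindName is the concatenation of all names; kindIndex[i] and
// kindIndex[i+1] bound the name of Kind(i) within it.
const kindName = "InvalidIRILiteralBlank"

var kindIndex = [...]uint8{0, 7, 10, 17, 22}

// String returns the name of k, or "Kind(n)" for values outside the table.
func (k Kind) String() string {
	if k < 0 || k >= Kind(len(kindIndex)-1) {
		return "Kind(" + strconv.FormatInt(int64(k), 10) + ")"
	}
	return kindName[kindIndex[k]:kindIndex[k+1]]
}

func main() {
	fmt.Println(Invalid, IRI, Literal, Blank, Kind(7))
	// prints "Invalid IRI Literal Blank Kind(7)"
}
```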

Binary file not shown.


@@ -0,0 +1,83 @@
// Copyright ©2020 The Gonum Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// Ragel grammar definition derived from http://www.w3.org/TR/n-quads/#sec-grammar.
%%{
machine nquads;
alphtype rune;
PN_CHARS_BASE = [A-Za-z]
| 0x00c0 .. 0x00d6
| 0x00d8 .. 0x00f6
| 0x00f8 .. 0x02ff
| 0x0370 .. 0x037d
| 0x037f .. 0x1fff
| 0x200c .. 0x200d
| 0x2070 .. 0x218f
| 0x2c00 .. 0x2fef
| 0x3001 .. 0xd7ff
| 0xf900 .. 0xfdcf
| 0xfdf0 .. 0xfffd
| 0x10000 .. 0xeffff
;
PN_CHARS_U = PN_CHARS_BASE | '_' | ':' ;
PN_CHARS = PN_CHARS_U
| '-'
| [0-9]
| 0xb7
| 0x0300 .. 0x036f
| 0x203f .. 0x2040
;
BLANK_NODE_LABEL = (PN_CHARS_U | [0-9]) ((PN_CHARS | '.')* PN_CHARS)? ;
BLANK_NODE = '_:' BLANK_NODE_LABEL ;
ECHAR = ('\\' [tbnrf"'\\]) ;
UCHAR = ('\\u' xdigit {4}
| '\\U' xdigit {8})
;
STRING_LITERAL = (
0x00 .. 0x09
| 0x0b .. 0x0c
| 0x0e .. '!'
| '#' .. '['
| ']' .. 0x10ffff
| ECHAR
| UCHAR)*
;
STRING_LITERAL_QUOTE = '"' STRING_LITERAL '"' ;
IRI = (
'!' .. ';'
| '='
| '?' .. '['
| ']'
| '_'
| 'a' .. 'z'
| '~'
| 0x80 .. 0x10ffff
| UCHAR)*
;
IRIREF = '<' IRI >StartIRI %EndIRI '>' ;
LANGTAG = '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)* ;
whitespace = [ \t] ;
literal = STRING_LITERAL_QUOTE ('^^' IRIREF | LANGTAG)? ;
subject = IRIREF | BLANK_NODE ;
predicate = IRIREF ;
object = IRIREF | BLANK_NODE | literal ;
graphLabel = IRIREF | BLANK_NODE ;
}%%
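Review note: the `BLANK_NODE_LABEL` rule is the subtlest part of this grammar: the first rune must be in `PN_CHARS_U` or a digit, dots are allowed in the middle, and the label may not end with a dot. A hand-written equivalent can help when reading the generated tables. The sketch below is a direct transcription of the ranges in this file into plain Go; all function names are hypothetical and the package itself uses the Ragel-generated `checkLabelText`.

```go
package main

import "fmt"

// in reports whether lo <= r <= hi.
func in(r, lo, hi rune) bool { return lo <= r && r <= hi }

// isPNCharsBase transcribes the PN_CHARS_BASE ranges from nquads.rl.
func isPNCharsBase(r rune) bool {
	return in(r, 'A', 'Z') || in(r, 'a', 'z') ||
		in(r, 0x00c0, 0x00d6) || in(r, 0x00d8, 0x00f6) || in(r, 0x00f8, 0x02ff) ||
		in(r, 0x0370, 0x037d) || in(r, 0x037f, 0x1fff) || in(r, 0x200c, 0x200d) ||
		in(r, 0x2070, 0x218f) || in(r, 0x2c00, 0x2fef) || in(r, 0x3001, 0xd7ff) ||
		in(r, 0xf900, 0xfdcf) || in(r, 0xfdf0, 0xfffd) || in(r, 0x10000, 0xeffff)
}

// isPNCharsU is PN_CHARS_BASE | '_' | ':'.
func isPNCharsU(r rune) bool { return isPNCharsBase(r) || r == '_' || r == ':' }

// isPNChars adds '-', digits, U+00B7 and two combining ranges.
func isPNChars(r rune) bool {
	return isPNCharsU(r) || r == '-' || in(r, '0', '9') || r == 0xb7 ||
		in(r, 0x0300, 0x036f) || in(r, 0x203f, 0x2040)
}

// validBlankLabel reports whether label (without the "_:" prefix) matches
// (PN_CHARS_U | [0-9]) ((PN_CHARS | '.')* PN_CHARS)?
func validBlankLabel(label string) bool {
	rs := []rune(label)
	if len(rs) == 0 {
		return false
	}
	if !isPNCharsU(rs[0]) && !in(rs[0], '0', '9') {
		return false
	}
	for i, r := range rs[1:] {
		last := i == len(rs)-2 // index within rs[1:] of the final rune
		if isPNChars(r) || (r == '.' && !last) {
			continue
		}
		return false
	}
	return true
}

func main() {
	for _, s := range []string{"a", "a.b", "a..b", "0x", "a.", ".a", "-a", ""} {
		fmt.Printf("%-6q %t\n", s, validBlankLabel(s))
	}
}
```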

Binary file not shown.

3624
graph/formats/rdf/parse.go Normal file

File diff suppressed because it is too large


@@ -0,0 +1,53 @@
// Go code generated by go generate gonum.org/v1/gonum/graph/formats/rdf; DO NOT EDIT.
// Copyright ©2020 The Gonum Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package rdf
import (
"fmt"
"net/url"
"unicode"
)
%%{
machine nquads;
include "parse_actions.rl";
include "nquads.rl";
statement := (
whitespace* subject >StartSubject %SetSubject
whitespace* predicate >StartPredicate %SetPredicate
whitespace* object >StartObject %SetObject
(whitespace* graphLabel >StartLabel %SetLabel)?
whitespace* '.' whitespace* ('#' any*)? >Comment
) %Return @!Error ;
write data;
}%%
func parse(data []rune) (Statement, error) {
var (
cs, p int
pe = len(data)
eof = pe
subject = -1
predicate = -1
object = -1
label = -1
iri = -1
s Statement
)
%%write init;
%%write exec;
return Statement{}, ErrInvalid
}


@@ -0,0 +1,85 @@
// Copyright ©2020 The Gonum Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
%%{
machine nquads;
action StartSubject {
subject = p
}
action StartPredicate {
predicate = p
}
action StartObject {
object = p
}
action StartLabel {
label = p
}
action StartIRI {
iri = p
}
action SetSubject {
if subject < 0 {
panic("unexpected parser state: subject start not set")
}
s.Subject.Value = string(data[subject:p])
}
action SetPredicate {
if predicate < 0 {
panic("unexpected parser state: predicate start not set")
}
s.Predicate.Value = string(data[predicate:p])
}
action SetObject {
if object < 0 {
panic("unexpected parser state: object start not set")
}
s.Object.Value = string(data[object:p])
}
action SetLabel {
if label < 0 {
panic("unexpected parser state: label start not set")
}
s.Label.Value = string(data[label:p])
}
action EndIRI {
if iri < 0 {
panic("unexpected parser state: iri start not set")
}
switch u, err := url.Parse(string(data[iri:p])); {
case err != nil:
return s, err
case !u.IsAbs():
return s, fmt.Errorf("%w: relative IRI ref %q", ErrInvalid, string(data[iri:p]))
}
}
action Return {
return s, nil
}
action Comment {
}
action Error {
if p < len(data) {
if r := data[p]; r < unicode.MaxASCII {
return s, fmt.Errorf("%w: unexpected rune %q at %d", ErrInvalid, data[p], p)
} else {
return s, fmt.Errorf("%w: unexpected rune %q (\\u%04[2]x) at %d", ErrInvalid, data[p], p)
}
}
return s, ErrIncomplete
}
}%%

387
graph/formats/rdf/rdf.go Normal file

@@ -0,0 +1,387 @@
// Copyright ©2020 The Gonum Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
//go:generate ragel -Z -G2 parse.rl
//go:generate ragel -Z -G2 extract.rl
//go:generate ragel -Z -G2 check.rl
//go:generate stringer -type=Kind
package rdf
import (
"bufio"
"bytes"
"errors"
"fmt"
"io"
"net/url"
"strconv"
"strings"
"unicode"
"unicode/utf8"
"gonum.org/v1/gonum/graph"
)
var (
_ graph.Node = Term{}
_ graph.Edge = (*Statement)(nil)
_ graph.Line = (*Statement)(nil)
)
var (
ErrInvalid = errors.New("invalid N-Quad")
ErrIncomplete = errors.New("incomplete N-Quad")
ErrInvalidTerm = errors.New("invalid term")
ErrIncompleteTerm = errors.New("incomplete term")
)
// Kind represents the kind of an RDF term.
type Kind int
const (
// Invalid is an invalid RDF term.
Invalid Kind = iota
// IRI is the kind of an IRI term.
// https://www.w3.org/TR/n-quads/#sec-iri
IRI
// Literal is the kind of an RDF literal.
// https://www.w3.org/TR/n-quads/#sec-literals
Literal
// Blank is the kind of an RDF blank node term.
// https://www.w3.org/TR/n-quads/#BNodes
Blank
)
// Term is an RDF term. It implements the graph.Node interface.
type Term struct {
// Value is the text value of the term.
Value string
// UID is the unique ID for the term
// in a collection of RDF terms.
UID int64
}
// NewBlankTerm returns a Term based on the provided RDF blank node
// label. The label should not include the "_:" prefix. The returned
// Term will not have the UID set.
func NewBlankTerm(label string) (Term, error) {
err := checkLabelText([]rune(label))
if err != nil {
return Term{}, err
}
return Term{Value: "_:" + label}, nil
}
// NewIRITerm returns a Term based on the provided IRI which must
// be valid and include a scheme. The returned Term will not have
// the UID set.
func NewIRITerm(iri string) (Term, error) {
err := checkIRIText(iri)
if err != nil {
return Term{}, err
}
return Term{Value: escape("<", iri, ">")}, nil
}
// NewLiteralTerm returns a Term based on the literal text and an
// optional qualifier which may either be a "@"-prefixed language
// tag or a valid IRI. The text will be escaped if necessary and quoted,
// and if an IRI is given it will be escaped if necessary. The returned
// Term will not have the UID set.
func NewLiteralTerm(text, qual string) (Term, error) {
text = escape(`"`, text, `"`)
if qual == "" {
return Term{Value: text}, nil
}
if strings.HasPrefix(qual, "@") {
err := checkLangText([]byte(qual))
if err != nil {
return Term{}, err
}
return Term{Value: text + qual}, nil
}
err := checkIRIText(qual)
if err != nil {
return Term{}, err
}
return Term{Value: text + escape("^^<", qual, ">")}, nil
}
func checkIRIText(iri string) error {
switch u, err := url.Parse(iri); {
case err != nil:
return err
case u.Scheme == "":
return fmt.Errorf("rdf: %w: relative IRI ref %q", ErrInvalidTerm, iri)
default:
return nil
}
}
// Parts returns the parts of the term and the kind of the term.
// IRI node text is returned as a valid IRI with the quoting angle
// brackets removed and escape sequences interpreted, and blank
// nodes are stripped of the "_:" prefix.
// When the term is a literal, qual will either be empty, an unescaped
// IRI, or an RDF language tag prefixed with an @ symbol. The literal
// text is returned unquoted and unescaped.
func (t Term) Parts() (text, qual string, kind Kind, err error) {
return extract([]rune(t.Value))
}
// ID returns the value of the Term's UID field.
func (t Term) ID() int64 { return t.UID }
// Statement is an RDF statement. It implements the graph.Edge and graph.Line
// interfaces.
type Statement struct {
Subject Term
Predicate Term
Object Term
Label Term
}
// String returns the RDF 1.1 N-Quad formatted statement.
func (s *Statement) String() string {
if s.Label.Value == "" {
return fmt.Sprintf("%s %s %s .", s.Subject.Value, s.Predicate.Value, s.Object.Value)
}
return fmt.Sprintf("%s %s %s %s .", s.Subject.Value, s.Predicate.Value, s.Object.Value, s.Label.Value)
}
// From returns the subject of the statement.
func (s *Statement) From() graph.Node { return s.Subject }
// To returns the object of the statement.
func (s *Statement) To() graph.Node { return s.Object }
// ID returns the UID of the Predicate field.
func (s *Statement) ID() int64 { return s.Predicate.UID }
// ReversedEdge returns the receiver unaltered. If there is a semantically
// valid edge reversal operation for the data, the user should implement
// this by wrapping Statement in a type performing that operation.
// See the ReversedLine example for details.
func (s *Statement) ReversedEdge() graph.Edge { return s }
// ReversedLine returns the receiver unaltered. If there is a semantically
// valid line reversal operation for the data, the user should implement
// this by wrapping Statement in a type performing that operation.
func (s *Statement) ReversedLine() graph.Line { return s }
// ParseNQuad parses the statement and returns the corresponding Statement.
// All Term UID fields are zero on return.
func ParseNQuad(statement string) (*Statement, error) {
s, err := parse([]rune(statement))
if err != nil {
return nil, err
}
return &s, nil
}
// Decoder is an RDF stream decoder. Statements returned by calls to the
// Unmarshal method have their Terms' UID fields set so that unique terms
// will have unique IDs and so can be used directly in a graph.Multi, or
// in a graph.Graph if all predicate terms are identical. IDs created by
// the decoder all exist within a single namespace and so Terms can be
// uniquely identified by their UID. Term UIDs start from 1, leaving the
// zero UID free for RDF-aware client graphs to treat as unassigned and
// to assign an ID of their own.
type Decoder struct {
scanner *bufio.Scanner
strings store
ids map[string]int64
}
// NewDecoder returns a new Decoder that takes input from r.
func NewDecoder(r io.Reader) *Decoder {
return &Decoder{
scanner: bufio.NewScanner(r),
strings: make(store),
ids: make(map[string]int64),
}
}
// Reset resets the decoder to use the provided io.Reader, retaining
// the existing Term ID mapping.
func (dec *Decoder) Reset(r io.Reader) {
dec.scanner = bufio.NewScanner(r)
dec.strings = make(store)
if dec.ids == nil {
dec.ids = make(map[string]int64)
}
}
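// Unmarshal returns the next statement from the input stream. Blank lines
// and lines beginning with '#' are skipped. It returns io.EOF when the
// input is exhausted.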
func (dec *Decoder) Unmarshal() (*Statement, error) {
for dec.scanner.Scan() {
data := bytes.TrimSpace(dec.scanner.Bytes())
if len(data) == 0 || data[0] == '#' {
continue
}
s, err := ParseNQuad(string(data))
if err != nil {
return nil, fmt.Errorf("rdf: failed to parse %q: %w", data, err)
}
if s == nil {
continue
}
s.Subject.Value = dec.strings.intern(s.Subject.Value)
s.Predicate.Value = dec.strings.intern(s.Predicate.Value)
s.Object.Value = dec.strings.intern(s.Object.Value)
s.Subject.UID = dec.idFor(s.Subject.Value)
s.Object.UID = dec.idFor(s.Object.Value)
s.Predicate.UID = dec.idFor(s.Predicate.Value)
if s.Label.Value != "" {
s.Label.Value = dec.strings.intern(s.Label.Value)
s.Label.UID = dec.idFor(s.Label.Value)
}
return s, nil
}
dec.strings = nil
err := dec.scanner.Err()
if err != nil {
return nil, err
}
return nil, io.EOF
}
func (dec *Decoder) idFor(s string) int64 {
id, ok := dec.ids[s]
if ok {
return id
}
id = int64(len(dec.ids)) + 1
dec.ids[s] = id
return id
}
// Terms returns the mapping between terms and graph node IDs constructed
// while decoding the RDF statement stream.
func (dec *Decoder) Terms() map[string]int64 {
return dec.ids
}
// store is a string interning implementation.
type store map[string]string
// intern returns an interned version of the parameter.
func (is store) intern(s string) string {
if s == "" {
return ""
}
if len(s) < 2 || len(s) > 512 {
// Not enough benefit on average with real data.
return s
}
t, ok := is[s]
if ok {
return t
}
is[s] = s
return s
}
func escape(lq, s, rq string) string {
var buf strings.Builder
if lq != "" {
buf.WriteString(lq)
}
for _, r := range s {
var c byte
switch r {
case '\n':
c = 'n'
case '\r':
c = 'r'
case '"', '\\':
c = byte(r)
default:
const hex = "0123456789abcdef"
switch {
case r <= unicode.MaxASCII || strconv.IsPrint(r):
buf.WriteRune(r)
case r > utf8.MaxRune:
r = 0xFFFD
fallthrough
case r < 0x10000:
buf.WriteString("\\u")
for s := 12; s >= 0; s -= 4 {
buf.WriteByte(hex[r>>uint(s)&0xf])
}
default:
buf.WriteString("\\U")
for s := 28; s >= 0; s -= 4 {
buf.WriteByte(hex[r>>uint(s)&0xf])
}
}
continue
}
buf.Write([]byte{'\\', c})
}
if rq != "" {
buf.WriteString(rq)
}
return buf.String()
}
func unEscape(r []rune) string {
var buf strings.Builder
for i := 0; i < len(r); {
switch r[i] {
case '\\':
i++
var c byte
switch r[i] {
case 't':
c = '\t'
case 'b':
c = '\b'
case 'n':
c = '\n'
case 'r':
c = '\r'
case 'f':
c = '\f'
case '"':
c = '"'
case '\\':
c = '\\'
case '\'':
c = '\''
case 'u':
rc, err := strconv.ParseInt(string(r[i+1:i+5]), 16, 32)
if err != nil {
panic(fmt.Errorf("internal parser error: %w", err))
}
buf.WriteRune(rune(rc))
i += 5
continue
case 'U':
rc, err := strconv.ParseInt(string(r[i+1:i+9]), 16, 32)
if err != nil {
panic(fmt.Errorf("internal parser error: %w", err))
}
buf.WriteRune(rune(rc))
i += 9
continue
}
buf.WriteByte(c)
default:
buf.WriteRune(r[i])
}
i++
}
return buf.String()
}
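
The Decoder's term handling above combines two small techniques: interning term strings so repeated terms share storage, and assigning IDs in first-seen order starting from 1. The following is a minimal standalone sketch of both, not the package's code. The interner and idTable types are hypothetical stand-ins for the unexported store type and the idFor method, and the length thresholds used by store.intern are left out.

```go
package main

import "fmt"

// interner is a hypothetical stand-in for the package's unexported store type.
type interner map[string]string

// intern returns a canonical copy of s so that repeated terms share one string.
func (in interner) intern(s string) string {
	if t, ok := in[s]; ok {
		return t
	}
	in[s] = s
	return s
}

// idTable is a hypothetical stand-in for the Decoder's ids map.
type idTable map[string]int64

// idFor returns the ID for s, assigning the next ID, counted from 1,
// the first time s is seen.
func (t idTable) idFor(s string) int64 {
	if id, ok := t[s]; ok {
		return id
	}
	id := int64(len(t)) + 1
	t[s] = id
	return id
}

func main() {
	strs := make(interner)
	ids := make(idTable)
	for _, term := range []string{"_:alice", "<knows>", "_:bob", "_:alice"} {
		term = strs.intern(term)
		fmt.Println(term, ids.idFor(term))
	}
}
```

Because IDs are derived from the size of the table, a term that is seen again always maps back to its original ID, and zero is never handed out.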


@@ -0,0 +1,115 @@
// Copyright ©2020 The Gonum Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package rdf_test
import (
"fmt"
"log"
"os"
"strings"
"text/tabwriter"
"gonum.org/v1/gonum/graph"
"gonum.org/v1/gonum/graph/encoding"
"gonum.org/v1/gonum/graph/encoding/dot"
"gonum.org/v1/gonum/graph/formats/rdf"
"gonum.org/v1/gonum/graph/multi"
)
// dotNode implements graph.Node and dot.Node to allow the
// RDF term value to be given to the DOT encoder.
type dotNode struct {
rdf.Term
}
func (n dotNode) DOTID() string { return n.Term.Value }
// dotLine implements graph.Line and encoding.Attributer to
// allow the line's RDF term value to be given to the DOT
// encoder and for the nodes to be shimmed to the dotNode
// type.
//
// Because the graph here is directed and we are not performing
// any line reversals, it is safe not to implement the
// ReversedLine method on dotLine; it will never be called.
type dotLine struct {
*rdf.Statement
}
func (l dotLine) From() graph.Node { return dotNode{l.Subject} }
func (l dotLine) To() graph.Node { return dotNode{l.Object} }
func (l dotLine) Attributes() []encoding.Attribute {
return []encoding.Attribute{{Key: "label", Value: l.Predicate.Value}}
}
func Example_graph() {
const statements = `
_:alice <http://xmlns.com/foaf/0.1/knows> _:bob .
_:alice <http://xmlns.com/foaf/0.1/givenName> "Alice" .
_:alice <http://xmlns.com/foaf/0.1/familyName> "Smith" .
_:bob <http://xmlns.com/foaf/0.1/knows> _:alice .
_:bob <http://xmlns.com/foaf/0.1/givenName> "Bob" .
_:bob <http://xmlns.com/foaf/0.1/familyName> "Smith" .
`
// Decode the statement stream and insert the lines into a multigraph.
g := multi.NewDirectedGraph()
dec := rdf.NewDecoder(strings.NewReader(statements))
for {
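l, err := dec.Unmarshal()
if err != nil {
// Unmarshal returns io.EOF at the end of input. Any other
// error also ends decoding here; real code should report it.
break
}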
// Wrap the line with a shim type to allow the RDF values
// to be passed to the DOT marshaling routine.
g.SetLine(dotLine{l})
}
// Marshal the graph into DOT.
b, err := dot.MarshalMulti(g, "smiths", "", "\t")
if err != nil {
log.Fatal(err)
}
fmt.Printf("%s\n\n", b)
// Get the ID look-up table.
w := tabwriter.NewWriter(os.Stdout, 0, 4, 1, ' ', 0)
fmt.Fprintln(w, "Term\tID")
for t, id := range dec.Terms() {
fmt.Fprintf(w, "%s\t%d\n", t, id)
}
w.Flush()
// Unordered output:
//
// digraph smiths {
// // Node definitions.
// "_:alice";
// "_:bob";
// "Alice";
// "Smith";
// "Bob";
//
// // Edge definitions.
// "_:alice" -> "_:bob" [label=<http://xmlns.com/foaf/0.1/knows>];
// "_:alice" -> "Alice" [label=<http://xmlns.com/foaf/0.1/givenName>];
// "_:alice" -> "Smith" [label=<http://xmlns.com/foaf/0.1/familyName>];
// "_:bob" -> "_:alice" [label=<http://xmlns.com/foaf/0.1/knows>];
// "_:bob" -> "Smith" [label=<http://xmlns.com/foaf/0.1/familyName>];
// "_:bob" -> "Bob" [label=<http://xmlns.com/foaf/0.1/givenName>];
// }
//
// Term ID
// _:alice 1
// _:bob 2
// <http://xmlns.com/foaf/0.1/knows> 3
// "Alice" 4
// <http://xmlns.com/foaf/0.1/givenName> 5
// "Smith" 6
// <http://xmlns.com/foaf/0.1/familyName> 7
// "Bob" 8
}


@@ -0,0 +1,200 @@
// Copyright ©2020 The Gonum Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package rdf_test
import (
"fmt"
"log"
"strings"
"gonum.org/v1/gonum/graph"
"gonum.org/v1/gonum/graph/encoding"
"gonum.org/v1/gonum/graph/encoding/dot"
"gonum.org/v1/gonum/graph/formats/rdf"
"gonum.org/v1/gonum/graph/multi"
)
// foodNode implements graph.Node, dot.Node and encoding.Attributer
// to allow the RDF term value to be given to the DOT encoder.
type foodNode struct {
rdf.Term
}
func (n foodNode) DOTID() string {
text, _, kind, err := n.Term.Parts()
if err != nil {
return fmt.Sprintf("error:%s", n.Term.Value)
}
switch kind {
case rdf.Blank:
return n.Term.Value
case rdf.IRI:
return text
case rdf.Literal:
return fmt.Sprintf("%q", text)
default:
return fmt.Sprintf("invalid:%s", n.Term.Value)
}
}
func (n foodNode) Attributes() []encoding.Attribute {
_, qual, _, err := n.Term.Parts()
if err != nil {
return []encoding.Attribute{{Key: "error", Value: err.Error()}}
}
if qual == "" {
return nil
}
// All qualifiers in this example have the form prefix:name.
parts := strings.Split(qual, ":")
return []encoding.Attribute{{Key: parts[0], Value: parts[1]}}
}
// foodLine implements graph.Line and encoding.Attributer to
// allow the line's RDF term value to be given to the DOT
// encoder and for the nodes to be shimmed to the foodNode
// type.
//
// It also implements line reversal for the semantics of
// a food web with some taxonomic information.
type foodLine struct {
*rdf.Statement
}
func (l foodLine) From() graph.Node { return foodNode{l.Subject} }
func (l foodLine) To() graph.Node { return foodNode{l.Object} }
func (l foodLine) ReversedLine() graph.Line {
if l.Predicate.Value == "<tax:is>" {
// This should remain unreversed, so return as is.
return l
}
s := *l.Statement
// Reverse the line end points.
s.Subject, s.Object = s.Object, s.Subject
// Invert the semantics of the predicate.
switch s.Predicate.Value {
case "<eco:eats>":
s.Predicate.Value = "<eco:eaten-by>"
case "<eco:eaten-by>":
s.Predicate.Value = "<eco:eats>"
case "<tax:is-a>":
s.Predicate.Value = "<tax:includes>"
case "<tax:includes>":
s.Predicate.Value = "<tax:is-a>"
default:
panic("invalid predicate")
}
// All IDs returned by the RDF parser are positive, so
// sign reverse the edge ID to avoid any collisions.
s.Predicate.UID *= -1
return foodLine{&s}
}
func (l foodLine) Attributes() []encoding.Attribute {
text, _, _, err := l.Predicate.Parts()
if err != nil {
return []encoding.Attribute{{Key: "error", Value: err.Error()}}
}
parts := strings.Split(text, ":")
return []encoding.Attribute{{Key: parts[0], Value: parts[1]}}
}
// expand copies src into dst, adding the reversal of each line if it is
// distinct.
func expand(dst, src *multi.DirectedGraph) {
it := src.Edges()
for it.Next() {
lit := it.Edge().(multi.Edge)
for lit.Next() {
l := lit.Line()
r := l.ReversedLine()
dst.SetLine(l)
if l == r {
continue
}
dst.SetLine(r)
}
}
}
func ExampleStatement_ReversedLine() {
const statements = `
_:wolf <tax:is-a> _:animal .
_:wolf <tax:is> "Wolf"^^<tax:common> .
_:wolf <tax:is> "Canis lupus"^^<tax:binomial> .
_:wolf <eco:eats> _:sheep .
_:sheep <tax:is-a> _:animal .
_:sheep <tax:is> "Sheep"^^<tax:common> .
_:sheep <tax:is> "Ovis aries"^^<tax:binomial> .
_:sheep <eco:eats> _:grass .
_:grass <tax:is-a> _:plant .
_:grass <tax:is> "Grass"^^<tax:common> .
_:grass <tax:is> "Lolium perenne"^^<tax:binomial> .
_:grass <tax:is> "Festuca rubra"^^<tax:binomial> .
_:grass <tax:is> "Poa pratensis"^^<tax:binomial> .
`
// Decode the statement stream and insert the lines into a multigraph.
g := multi.NewDirectedGraph()
dec := rdf.NewDecoder(strings.NewReader(statements))
for {
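l, err := dec.Unmarshal()
if err != nil {
// Unmarshal returns io.EOF at the end of input. Any other
// error also ends decoding here; real code should report it.
break
}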
// Wrap the line with a shim type to allow the RDF values
// to be passed to the DOT marshaling routine.
g.SetLine(foodLine{l})
}
h := multi.NewDirectedGraph()
expand(h, g)
// Marshal the graph into DOT.
b, err := dot.MarshalMulti(h, "food web", "", "\t")
if err != nil {
log.Fatal(err)
}
fmt.Printf("%s\n\n", b)
// Output:
//
// digraph "food web" {
// // Node definitions.
// "_:wolf";
// "_:animal";
// "Wolf" [tax=common];
// "Canis lupus" [tax=binomial];
// "_:sheep";
// "Sheep" [tax=common];
// "Ovis aries" [tax=binomial];
// "_:grass";
// "_:plant";
// "Grass" [tax=common];
// "Lolium perenne" [tax=binomial];
// "Festuca rubra" [tax=binomial];
// "Poa pratensis" [tax=binomial];
//
// // Edge definitions.
// "_:wolf" -> "_:animal" [tax="is-a"];
// "_:wolf" -> "Wolf" [tax=is];
// "_:wolf" -> "Canis lupus" [tax=is];
// "_:wolf" -> "_:sheep" [eco=eats];
// "_:animal" -> "_:wolf" [tax=includes];
// "_:animal" -> "_:sheep" [tax=includes];
// "_:sheep" -> "_:wolf" [eco="eaten-by"];
// "_:sheep" -> "_:animal" [tax="is-a"];
// "_:sheep" -> "Sheep" [tax=is];
// "_:sheep" -> "Ovis aries" [tax=is];
// "_:sheep" -> "_:grass" [eco=eats];
// "_:grass" -> "_:sheep" [eco="eaten-by"];
// "_:grass" -> "_:plant" [tax="is-a"];
// "_:grass" -> "Grass" [tax=is];
// "_:grass" -> "Lolium perenne" [tax=is];
// "_:grass" -> "Festuca rubra" [tax=is];
// "_:grass" -> "Poa pratensis" [tax=is];
// "_:plant" -> "_:grass" [tax=includes];
// }
}
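
The reversal logic in foodLine.ReversedLine above can be read as a table lookup: swap the end points, replace the predicate with its inverse, and negate the ID so that the reversed line cannot collide with a decoder-assigned ID. The following is a minimal standalone sketch of that idea. The line type, the inverse table and the reversed function are hypothetical simplifications, and unlike the example above the sketch returns the input unaltered for any predicate without an inverse instead of panicking.

```go
package main

import "fmt"

// line is a hypothetical simplified statement with string end points.
type line struct {
	from, to, pred string
	id             int64
}

// inverse maps each reversible predicate to its semantic inverse.
var inverse = map[string]string{
	"<eco:eats>":     "<eco:eaten-by>",
	"<eco:eaten-by>": "<eco:eats>",
	"<tax:is-a>":     "<tax:includes>",
	"<tax:includes>": "<tax:is-a>",
}

// reversed returns the semantic reversal of l and true, or l unaltered
// and false when the predicate has no inverse.
func reversed(l line) (line, bool) {
	inv, ok := inverse[l.pred]
	if !ok {
		return l, false
	}
	// Decoder IDs are positive, so the negated ID cannot collide.
	return line{from: l.to, to: l.from, pred: inv, id: -l.id}, true
}

func main() {
	lines := []line{
		{from: "_:wolf", to: "_:sheep", pred: "<eco:eats>", id: 3},
		{from: "_:wolf", to: `"Wolf"`, pred: "<tax:is>", id: 4},
	}
	for _, l := range lines {
		r, ok := reversed(l)
		fmt.Println(r, ok)
	}
}
```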


@@ -0,0 +1,182 @@
// Copyright ©2020 The Gonum Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package rdf
import (
"archive/tar"
"compress/gzip"
"fmt"
"io"
"os"
"path/filepath"
"reflect"
"strings"
"testing"
)
func TestRDFWorkingGroupSuite(t *testing.T) {
for _, file := range []string{
"ntriple_tests.tar.gz",
"nquad_tests.tar.gz",
} {
suite, err := os.Open(file)
if err != nil {
t.Fatalf("Failed to open test suite in %q: %v", file, err)
}
defer suite.Close()
r, err := gzip.NewReader(suite)
if err != nil {
t.Fatalf("Failed to uncompress test suite in %q: %v", file, err)
}
tr := tar.NewReader(r)
for {
h, err := tr.Next()
if err != nil {
if err == io.EOF {
break
}
t.Fatalf("Unexpected error while reading suite archive: %v", err)
}
h.Name = filepath.Base(h.Name)
if filepath.Ext(h.Name) != ".nt" && filepath.Ext(h.Name) != ".nq" {
continue
}
if _, ok := testSuite[h.Name]; !ok {
t.Errorf("Missing test suite item %q", h.Name)
continue
}
isBad := strings.Contains(h.Name, "bad")
var got []statement
dec := NewDecoder(tr)
for i := 0; ; i++ {
s, err := dec.Unmarshal()
if err == io.EOF {
break
}
gotBad := err != nil
if gotBad != isBad {
t.Errorf("Unexpected error return for test suite item %q, got: %v", h.Name, err)
}
var subj, pred, obj, lab term
if s != nil {
subj.text, subj.qual, subj.kind, _ = s.Subject.Parts()
pred.text, pred.qual, pred.kind, _ = s.Predicate.Parts()
obj.text, obj.qual, obj.kind, _ = s.Object.Parts()
lab.text, lab.qual, lab.kind, _ = s.Label.Parts()
if lab.text == "" {
lab = term{}
}
got = append(got, statement{testSuite[h.Name][i].input, subj, pred, obj, lab})
}
if !gotBad {
_, err = ParseNQuad(s.String())
if err != nil {
t.Errorf("Unexpected error return for valid statement in test suite item %q (%#v) got: %v rendered as\n%[2]s", h.Name, s, err)
}
st, err := termFor(subj.text, subj.qual, subj.kind)
if err != nil {
t.Errorf("Unexpected error return for valid subject in test suite item %q (%#v) got: %v rendered as\n%[2]s", h.Name, s, err)
}
pt, err := termFor(pred.text, pred.qual, pred.kind)
if err != nil {
t.Errorf("Unexpected error return for valid predicate in test suite item %q (%#v) got: %v rendered as\n%[2]s", h.Name, s, err)
}
ot, err := termFor(obj.text, obj.qual, obj.kind)
if err != nil {
t.Errorf("Unexpected error return for valid object in test suite item %q (%#v) got: %v rendered as\n%[2]s", h.Name, s, err)
}
lt, err := termFor(lab.text, lab.qual, lab.kind)
if err != nil {
t.Errorf("Unexpected error return for valid label in test suite item %q (%#v) got: %v rendered as\n%[2]s", h.Name, s, err)
}
// We can't check that we recreate the original from the test suite
// due to escaping, but we can check for a second pass through the
// round-trip.
c := &Statement{Subject: st, Predicate: pt, Object: ot, Label: lt}
pc, err := ParseNQuad(c.String())
if err != nil {
t.Errorf("Unexpected error return for reconstructed statement in test suite item %q (%#v) got: %v rendered as\n%[2]s", h.Name, s, err)
}
if !reflect.DeepEqual(c, pc) {
t.Errorf("Unexpected reconstruction:\norig: %#v\ncons: %#v\nparsed:%#v", s, c, pc)
}
}
}
if !reflect.DeepEqual(testSuite[h.Name], got) {
t.Errorf("Unexpected result for test suite item %q", h.Name)
}
}
}
}
func termFor(text, qual string, kind Kind) (Term, error) {
switch kind {
case Invalid:
return Term{}, nil
case Blank:
return NewBlankTerm(text)
case IRI:
return NewIRITerm(text)
case Literal:
return NewLiteralTerm(text, qual)
default:
panic(fmt.Sprintf("bad test kind=%d", kind))
}
}
var escapeSequenceTests = []struct {
escaped string
unEscaped string
canRoundTrip bool
}{
{escaped: `plain text!`, unEscaped: "plain text!", canRoundTrip: true},
{escaped: `\t`, unEscaped: "\t", canRoundTrip: false},
{escaped: `\b`, unEscaped: "\b", canRoundTrip: false},
{escaped: `\n`, unEscaped: "\n", canRoundTrip: true},
{escaped: `\r`, unEscaped: "\r", canRoundTrip: true},
{escaped: `\f`, unEscaped: "\f", canRoundTrip: false},
{escaped: `\\`, unEscaped: "\\", canRoundTrip: true},
{escaped: `\u0080`, unEscaped: "\u0080", canRoundTrip: true},
{escaped: `\U00000080`, unEscaped: "\u0080", canRoundTrip: false},
{escaped: `\t\b\n\r\f\"'\\`, unEscaped: "\t\b\n\r\f\"'\\", canRoundTrip: false},
{escaped: `\t\u0080`, unEscaped: "\t\u0080", canRoundTrip: false},
{escaped: `\b\U00000080`, unEscaped: "\b\u0080", canRoundTrip: false},
{escaped: `\u0080\n`, unEscaped: "\u0080\n", canRoundTrip: true},
{escaped: `\U00000080\r`, unEscaped: "\u0080\r", canRoundTrip: false},
{escaped: `\u00b7\f\U000000b7`, unEscaped: "·\f·", canRoundTrip: false},
{escaped: `\U000000b7\\\u00b7`, unEscaped: "·\\·", canRoundTrip: false},
{escaped: `\U00010105\\\U00010106`, unEscaped: "\U00010105\\\U00010106", canRoundTrip: true},
}
func TestUnescape(t *testing.T) {
for _, test := range escapeSequenceTests {
got := unEscape([]rune(test.escaped))
if got != test.unEscaped {
t.Errorf("Failed to properly unescape %q, got:%q want:%q", test.escaped, got, test.unEscaped)
}
if test.canRoundTrip {
got = escape("", test.unEscaped, "")
if got != test.escaped {
t.Errorf("Failed to properly escape %q, got:%q want:%q", test.unEscaped, got, test.escaped)
}
got = escape(`"`, test.unEscaped, `"`)
if got != `"`+test.escaped+`"` {
t.Errorf("Failed to properly escape %q quoted, got:%q want:%q", test.unEscaped, got, `"`+test.escaped+`"`)
}
}
}
}
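
The canRoundTrip field in escapeSequenceTests above marks the cases where escaping the unescaped text gives back the original escaped form. Some inputs cannot round-trip because the unescaper accepts more spellings than the escaper emits. The following is a minimal sketch of that asymmetry, assuming a deliberately reduced pair of functions that handle only backslash, newline and tab. They are illustrations, not the package's escape and unEscape.

```go
package main

import (
	"fmt"
	"strings"
)

// unescaper interprets \\, \t and \n. Matches are tried in argument
// order, so an escaped backslash is consumed before \t or \n.
var unescaper = strings.NewReplacer(`\\`, `\`, `\t`, "\t", `\n`, "\n")

// escaper escapes only backslash and newline; a tab is written raw.
var escaper = strings.NewReplacer(`\`, `\\`, "\n", `\n`)

func unescape(s string) string { return unescaper.Replace(s) }
func escape(s string) string   { return escaper.Replace(s) }

func main() {
	tests := []struct {
		escaped      string
		unescaped    string
		canRoundTrip bool
	}{
		{escaped: `a\nb`, unescaped: "a\nb", canRoundTrip: true},
		{escaped: `a\\b`, unescaped: `a\b`, canRoundTrip: true},
		{escaped: `a\tb`, unescaped: "a\tb", canRoundTrip: false},
	}
	for _, test := range tests {
		got := unescape(test.escaped)
		roundTrips := escape(got) == test.escaped
		fmt.Printf("%q: unescape ok=%t, round trip=%t, expected round trip=%t\n",
			test.escaped, got == test.unescaped, roundTrips, test.canRoundTrip)
	}
}
```

Recording the flag per case lets one table drive both directions of the test without asserting an escaped form the escaper never produces.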

File diff suppressed because it is too large.


@@ -19,8 +19,12 @@ type Edge interface {
 	// To returns the to node of the edge.
 	To() Node
-	// ReversedEdge returns an edge that has
-	// the end points of the receiver swapped.
+	// ReversedEdge returns the edge reversal of the receiver
+	// if a reversal is valid for the data type.
+	// When a reversal is valid an edge of the same type as
+	// the receiver with nodes of the receiver swapped should
+	// be returned, otherwise the receiver should be returned
+	// unaltered.
 	ReversedEdge() Edge
 }


@@ -13,8 +13,12 @@ type Line interface {
 	// To returns the to node of the edge.
 	To() Node
-	// ReversedLine returns a line that has the
-	// end points of the receiver swapped.
+	// ReversedLine returns the edge reversal of the receiver
+	// if a reversal is valid for the data type.
+	// When a reversal is valid an edge of the same type as
+	// the receiver with nodes of the receiver swapped should
+	// be returned, otherwise the receiver should be returned
+	// unaltered.
 	ReversedLine() Line
// ID returns the unique ID for the Line. // ID returns the unique ID for the Line.