From: Ron Pero
To: Perl-XML Mailing List
Date: Thu, 27 Jan 2000 16:05:01 -0500
Subject: [sml-dev] Data Model Poll FAQ

For the past month or more there has been a very interesting discussion
list about creating a simple subset of XML, tentatively called Simple
Markup Language. CXML, Common XML, has also been an important topic.

I just want to be sure that the brain-power on this list is aware of the
SML/CXML endeavor. Perl tools will be useful, and the perspectives of the
people here would also help.

The summary below was posted today, and can spare you from rummaging
through the 2000 messages created to date.

Regards,
Ron

>From: "Oren Ben-Kiki"
>Date: Thu, 27 Jan 2000 21:06:17 +0200
>Reply-To: sml-dev@egroups.com
>Subject: [sml-dev] Data Model Poll FAQ
>
>I'll try to summarize the differences between Joe's model and my NVC one,
>and the main reasons for each design decision, so that people won't have to
>wade through the last 400 messages to catch up. I hope the following is a
>fair summary...
>
>--- THE SML DATA MODEL POLL FAQ ---
>
>- What is "SML" and what is "CXML"?
>
>"CXML" is Common-XML. It is a subset of XML which is guaranteed to work in
>every XML system. It contains elements, attributes, mixed content, some
>predefined and numeric character entities, only UTF-8/UTF-16 character
>encoding, and that's it. You can send a CXML file from Israel to Japan via a
>French system and know it will work as expected.
>
>CXML is strongly linked to the XML tool set. In particular, it is expected
>that CXML tools will mostly be built on top of SAX, DOM, etc., but avoiding
>the complexities and subtleties required by the full XML specs.
>
>"SML" is "Simple-Markup-Language". It provides just elements. SML is an
>attempt to provide a simple, uniform syntax, model and API which would avoid
>the XML complexities and which would serve as a better platform for
>developing *ML applications, especially (but not only) B2B ones.
>
>The rest of the issues relate just to SML...
>
>- What is "No mixed content"?
>
>"No mixed content" means that an element may contain either
>sub-elements _or_ text, but never both. For example, b is mixed
>content.
>
>- What are the consequences of "No mixed content"?
>
>"No mixed content" means all white space between an element and its
>sub-elements is ignored, but white space inside a leaf element is not.
>
>So " >/> > >" parses to the same data as "". But " " and
>" " parse to different data (the value of the "a" element is
>different).
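
The white space rule above can be illustrated with a short Python sketch.
It is only an illustration under stated assumptions, not anything posted to
sml-dev: the function names `to_sml` and `parse_sml`, the
`(name, value, children)` tuple layout, and the sample markup are all
hypothetical, and the standard `xml.etree.ElementTree` parser stands in for
a real SML parser. Attributes are not rejected here, although SML has none.

```python
import xml.etree.ElementTree as ET


def to_sml(elem):
    """Convert an ElementTree element to a (name, value, children) tuple."""
    children = list(elem)
    if not children:
        # Leaf element: the text is the value, white space preserved.
        # Assumption: an empty element gets the empty string as its value.
        return (elem.tag, elem.text or "", [])
    # Non-leaf element: text around sub-elements may only be white space,
    # and that white space is dropped rather than stored in the tree.
    texts = [elem.text] + [child.tail for child in children]
    if any(t and t.strip() for t in texts):
        raise ValueError(f"mixed content in <{elem.tag}>")
    return (elem.tag, None, [to_sml(child) for child in children])


def parse_sml(text):
    """Parse a string of elements-only markup into nested tuples."""
    return to_sml(ET.fromstring(text))


# White space between an element and its sub-elements is ignored...
same = parse_sml("<a>\n  <b/>\n</a>") == parse_sml("<a><b/></a>")
# ...but white space inside a leaf element is part of its value.
different = parse_sml("<a> x</a>") != parse_sml("<a>x</a>")
```

With this sketch, something like `<a>b<c/></a>` is rejected as mixed
content, which is the case a plain XML parser would have to carry around as
extra text nodes.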
>
>This allows a parser to be very simple and efficient, and lets the
>application populate the tree (or create events for it, or whatever) without
>cluttering it with the ignored white space. This is _not possible_ if the
>syntax allows mixed content.
>
>This also allows simple data models which don't suffer from tricky semantic
>problems while they are being edited.
>
>This is fine for B2B, but is not good for people marking up text. The
>premise is that people marking up text are (should be) using an SML-aware
>editor, or are using SGML (XML at best). Either way, if you are a
>mixed-content fan, then SML isn't for you. Go for CXML.
>
>- What's the SML data model?
>
>Depends if you use Joe's or the NVC (Oren's) model.
>
>Both models agree that a node has a name, a value string field, and a node
>children vector. In Joe's model, a node has either a value or children but
>never both, just like an SML element. In the NVC model, a node may have both
>a value and children.
>
>- Is there an SML API?
>
>There is a proposed API for NVC (in a very initial state), described in
>message 1660. There was no attempt as yet to define a similarly detailed API
>proposal for Joe's model.
>
>- How does this affect the syntax?
>
>Joe's model has a trivial 1-1 mapping to the elements-only SML syntax.
>
>The NVC model has a problem in mapping a node with both a value and children
>to SML syntax. It could be possible to design a new syntax which has
>intuitive 1-1 mapping with the NVC model, but that wouldn't be XML.
>
>Since XML compatibility is deemed more important than a 1-1 mapping with the
>syntax, the NVC model solves the problem by giving special semantics to the
><_> element. Note that the <_>...</_> element is a perfectly legal XML
>element which will work in every XML system. Also, no known XML language
>uses it.
>
>In the NVC model, <_> must contain text. It carries special meaning to the
>parser. It says "do not create a new node for this element.
>Instead take the following text and use it as the value of the current
>node". This means that there can be only one <_> element, that it must
>contain only text, and that it must be the first sub-element (for
>efficient streaming).
>
>In both models, every SML file is also a valid XML file.
>
>In Joe's model, every XML file consisting only of elements is a valid SML
>file. In the NVC model, in theory XML files using the <_> element in various
>ways wouldn't be valid SML files. In practice this isn't an issue since no
>known XML language uses <_>.
>
>The model created for a given SML file is the same for both models, as long
>as the <_> element is not used. Note that this means that parsing any
>file in an SML-compatible XML language such as XML-RPC will yield the same
>data for both models.
>
>- Why bother with the NVC model?
>
>It turns out that allowing a node to have both a value and children makes it
>easier to model certain information and write programs in a certain way
>(basically, as a set of weakly-interacting modules, using the "modification
>by addition" coding strategy).
>
>Examples of what is easy to achieve in this model and hard to do in
>Joe's model are: writing programs which are "syntax independent" to a large
>degree, using a "coloring" scheme; customizing the push/pull behavior of a
>streaming application; allowing for more forms of backward-compatible
>"document schema evolution".
>
>This is useful for generic, open-ended SML tools. It is less useful for
>specific, task-oriented SML tools. For the latter, <_> might seem like an
>arbitrary complication of Joe's model. For the former, <_> might seem like a
>necessary device to overcome the weakness of SML syntax.
>
>- What about namespaces?
>
>Neither model explicitly addresses this issue. Both implicitly assume that a
>node name is "unique" by using unique prefixes for all names. Since both
>models treat names in the same manner, any solution would probably be
>applicable to both.
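
To make the <_> rule from the syntax discussion concrete, here is a rough
Python sketch of how an NVC-style tree might be built. It is a sketch under
assumptions, not the API proposed in message 1660: the `Node` class and the
`to_nvc` function are hypothetical names, `xml.etree.ElementTree` stands in
for a real SML parser, and the mixed-content check is left out for brevity.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Node:
    """A name, a value string, and a vector of child nodes."""
    name: str
    value: str = ""
    children: list = field(default_factory=list)


def to_nvc(elem):
    """Build a Node from an element, giving <_> its special meaning."""
    node = Node(elem.tag)
    kids = list(elem)
    if not kids:
        # Leaf element: its text is the value, as in Joe's model.
        node.value = elem.text or ""
        return node
    if kids[0].tag == "_":
        # <_> creates no node; its text becomes the current node's value.
        if len(kids[0]):
            raise ValueError("<_> must contain only text")
        node.value = kids[0].text or ""
        kids = kids[1:]
    for kid in kids:
        if kid.tag == "_":
            raise ValueError("<_> must be the first sub-element, used once")
        node.children.append(to_nvc(kid))
    return node


# A node with both a value and a child, which Joe's model cannot express.
both = to_nvc(ET.fromstring("<a><_>v</_><b>x</b></a>"))
# Without <_>, the result has the same shape Joe's model would produce.
plain = to_nvc(ET.fromstring("<a><b>x</b></a>"))
```

Here `both` has the value "v" and one child named "b", while `plain` has
the same child and an empty value.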
>
>- Is there more information?
>
>The discussions in the last month or so have covered a lot of ground, and
>there were many question-and-answer threads between myself and various
>people about the NVC model. Feel free to raise any _additional_ questions
>:-)
>
>In addition, the following links are a "must read":
>
>Joe's bare-bones syntax:
>
>http://www.egroups.com/docvault/sml-dev/Work/Syntaxes/Barebones.html
>
>I'm not certain where the model definition is. Could someone upload one?
>
>Specifying NVC directly at the syntax level:
>
>http://www.egroups.com/docvault/sml-dev/Work/Syntaxes/NameValueChildren.html
>
>It is also possible to use the bare-bones syntax and express NVC as a
>restriction of it, instead. The NVC model is:
>
>http://www.egroups.com/docvault/sml-dev/Work/Models/1660_NameValueChildren.html
>
>And a proposed API for it is in:
>
>http://www.egroups.com/group/sml-dev/1660.html
>
>Given any additional spare time :-), the discussion in the last month or so
>has covered most issues in reasonable detail. The other model and syntax
>papers are also worth a look.
>
>- What next?
>
>Having read the above, make up your mind, and vote for your favorite
>approach in:
>
>http://www.egroups.com/vote?id=948930623686&listname=sml-dev
>
>Have fun,
>
>    Oren Ben-Kiki