Friday, November 16, 2007

SIFA's Boards Set Direction

Columbus, OH was the host city for SIFA's board retreat this year. I have a seat on the Technical Board since I'm the lead of the Data Model Task Force. This post is a summary of the major topics that were covered in the board retreat on November 14 and 15, 2007. Future posts will go into more detail on specific topics. The retreat kicked off on Wednesday, 11/14/2007 with a half-day joint session between the Technical Board and the Executive Board. Staff members presented status updates on various activities of the association.

SIF Association Activities

SIF 2 Certification

One of the major activities is the forthcoming release of the SIF 2 test harness and certification program. Even though the SIF 2 Implementation Specification was released over a year ago, there have been several challenges surrounding the certification program. Most, if not all, of these challenges have been political rather than technical in nature. There is always lag time between the release of a specification like SIF and the implementation of software in the field. However, the delays in the certification program, in my opinion, have definitely slowed down adoption of SIF 2.

National Data Model

Another major activity of the association is its involvement in the National Data Model. Vince Paredes, SIFA's data model architect, is the lead on this work from a SIFA perspective. We saw a preview of a more comprehensive presentation that Vince will deliver at the association-wide meeting in Washington, D.C. in January 2008. Quite a bit of new information was presented about this work, and I found it very interesting that the national data model is being defined using the Web Ontology Language (OWL). My hope is that this will begin to lead SIF down the semantic road. According to Vince, the initial development of the national PK-12 data model is about 10% complete. He also made it clear that development will be a continual, evolutionary effort; it will likely never be “done.”

Internationalization

The internationalization of SIF was also a major topic. A significant amount of work has been done by SIFA and vendor members to localize the specification for use in other countries. BECTA in the United Kingdom is currently working through a set of pilot SIF implementations. Australia, I believe, is pre-pilot but very interested in using SIF. Other countries have expressed interest as well. The primary component of the SIF specification that must be adapted for international use is the data model. Since SIF was initially developed in the U.S., its data model was patterned on the processes and data in the U.S. educational system. Each country wanting to implement a local version of SIF will need to go through a process to design and implement a data model that fits its educational system. To facilitate this process, the creation of an Internationalization Task Force has been proposed.

Other Activities

Other association-level activities that were discussed include the SIFA/ADL partnership (see my earlier blog post on this), support of the state Longitudinal Data System grant recipients, and marketing and membership. All of these items were introduced during the first joint session between the boards. After lunch on Wednesday, the boards entered individual sessions. The tech board discussed several items but two topics took the bulk of the time.

Major Tech Board Topics

The first major topic that the tech board discussed was the 18-month timeline for specification releases. The second was decoupling the data model from the infrastructure, which is being driven by several business cases, including internationalization.

18-Month Timeline

SIFA's current policy is to release new versions of the specification every 6 months. This rapid release cycle is critical to ensuring that SIF can quickly adapt to the needs of end-users. Achieving “out-of-the-box” interoperability and making SIF easier to implement continue to be the two major end-user-oriented goals that drive evolution of the specification. Releases during this 18-month time frame will feature new data objects and infrastructure changes that support these goals. SIF Services will also be released within this window. Work group/task force leads will be planning and prioritizing tasks in further detail over the upcoming weeks and months.

De-Coupling

Since its inception, specification developers envisioned the possibility that SIF would one day need to implement new options for transport. Remember that the SIF specification was created prior to wide adoption of XML-RPC, SOAP and REST. SIF's current transport mechanism is XML over HTTP. However, the current structure of the SIF XSD contains “hard links” between the infrastructure and data model. Therefore, new data models cannot be “plugged” into the current infrastructure without some customization of the infrastructure. Initial investigation by the Infrastructure Work Group and Mark Reichert, SIFA's CTO, seems to indicate that formally de-coupling the data model from the infrastructure will not require a huge level of effort. It is also believed that this formal decoupling will have minimal impact on developers, and perhaps zero impact on end-users.

Volunteers Come Together to Help Schools

SIFA is a volunteer-driven association with a diverse set of member organizations. States, school districts, K-12 software companies, SIF software companies, and systems integrators are the major sub-groups in SIFA's constituency. I'm continually impressed by how people from these diverse groups come together and harmonize their interests and knowledge within the specification development process.

Friday, November 9, 2007

SIFA and ADL Partner

Interoperable Learning Content Vision
Consider all of the educational content that society has developed: courses on world religions, lesson plans for high school geometry, corporate training materials on risk management, military training programs on surface-to-air missile operations. An immense body of educational content is "out there." Yet how many untold hours are spent every day recreating educational content that already exists?
The vision of interoperable learning content is to provide both visibility and accessibility to the widest possible body of digital instructional materials. Interoperable learning content technologies allow teachers to systematically find, obtain, license (if necessary), customize, and incorporate relevant materials into instruction. Fulfilling this vision is one of the missions of the SIFA-ADL partnership.
SIFA Partners with ADL on Interoperable Learning Content
The SIF Association is holding regional meetings in the United States to discuss some of the details of its partnership with ADL (Advanced Distributed Learning; http://www.adlnet.gov). The focus of the partnership between the two organizations is on interoperable learning content. I attended the meeting hosted by Chicago Public Schools on November 8, 2007. Organizations that had a presence at the meeting included Chicago Public Schools, Pearson, Follett, Integrity Technology Solutions, Educational Systemics and Plato Learning. Jill Abbott, SIFA's Learning Strategist, and Paul Jesukiewicz, ADL's Deputy Director, led the meeting. This article is a brief summary of what was presented at the meeting.
About ADL
ADL is an initiative sponsored by the US Department of Defense. The original concept behind ADL was to create a standard to facilitate the sharing of learning content for the US government. The result of this work has since moved into the mainstream and been adopted by higher education, industry, and international concerns.
ADL's Work Product: SCORM
SCORM is the standard, or reference model, defined by ADL that specifies how learning objects may be packaged and exchanged. The scope of an individual learning object may range from all of the content needed for a semester-long course down to a single course exercise. A SCORM shareable content object consists of two parts: a manifest and an instructional content payload.
The manifest contains metadata about the instructional content payload that tags the content and describes its organization. The manifest can also indicate sequence and branching among members of the content payload. For example, based on the results of a student's formative assessment, a learning management system could potentially select a different sequence through the content. The sequencing component of the manifest facilitates dynamic adaptation to an individual student's needs.
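To make the branching idea concrete, here is a minimal sketch in Python. It is only an illustration: the dictionary layout, the item names, the score thresholds, and the select_sequence function are all hypothetical and are not the actual SCORM manifest format, which is an XML document with its own sequencing rules.

```python
# Toy stand-in for a manifest's sequencing information (hypothetical layout,
# not the real SCORM XML). Branches are listed from highest to lowest
# minimum score on the formative assessment.
MANIFEST = {
    "title": "Fractions unit",
    "branches": [
        (80, ["enrichment", "unit_test"]),
        (50, ["practice", "unit_test"]),
        (0, ["review", "practice", "unit_test"]),
    ],
}


def select_sequence(manifest, score):
    """Return the ordered item ids of the first branch the score qualifies for."""
    for minimum, items in manifest["branches"]:
        if score >= minimum:
            return items
    raise ValueError("no branch covers this score")


strong_path = select_sequence(MANIFEST, 85)
struggling_path = select_sequence(MANIFEST, 10)
```

In this sketch the learning management system, not the content, decides which path to walk; the manifest only declares the options.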
Learning management systems are the primary consumers of shareable content objects. SCORM compliant content can come from a variety of sources ranging from content publishers to individual teachers. Various tools exist to enable the development of shareable content objects. The authoring subsystems of learning management systems can generally export shareable content objects. Standalone tools exist that allow authors to package content developed in other formats (e.g. office products, HTML) into shareable content objects.
Direction of ADL/SCORM
One of ADL's stated objectives is to divest itself of the stewardship and future development of SCORM. A new body called LETSI (Learning-Education-Training Systems Interoperability) will take this over. LETSI will be an international, lightweight organization that will govern SCORM development going forward. LETSI will build on the current version of SCORM to develop a common framework, known as Core SCORM, that will be extensible to meet the needs of diverse industries and groups. Core SCORM will define the fundamentals of shareable content objects, most likely including packaging, basic metadata, and sequencing.
Status of the Partnership
The SIFA-ADL partnership has been established, but the work of integrating SIF and SCORM is in its formative stages. The two organizations are soliciting use cases from vendors and end users, with an emphasis on how schools want to use the technology. This communities-of-practice pattern will be intrinsic to the LETSI/Core SCORM development process. Like SIFA and its contingent of school-affiliated organizations, other organizations, industries and groups will build on Core SCORM to meet community-specific use cases.
SIF and Educational Content

SIF is traditionally known as a way to model and move administrative and demographic data among operational information systems in schools. Those involved closely with SIF (including myself) are convinced of its overall, positive impact on school operations and data, including data quality and availability. With a widely implemented transport infrastructure and data model, SIFA is building capacity that will more directly impact teaching and learning within schools. The alliance with ADL will result in the eventual integration of Core SCORM and SIF, which will advance SIFA's teaching and learning initiatives and further establish SCORM as a global platform for interoperable learning content.
Impact on Online Learning and Personalized Learning
Although compelling from the standpoint of sharing and efficiency, the vision stated at the outset of this article does not do full justice to the potential of interoperable learning content. Online learning and personalized learning are two major beneficiaries of interoperable learning content.
Online learning continues to proliferate. Students are clearly expressing the preference to learn via the technology that is already intrinsic to their lives. In addition to becoming the learning method of choice for students, virtual schools fill critical gaps in traditional educational systems. Virtual schools can help students maximize their potential by serving as an alternative form of instruction. Interoperable learning content has the potential to remove the burden of content development from virtual schools by providing access to a world-wide network of instructional material.
Personalized learning takes the benefits of online learning to a new level. By implementing a continuous feedback loop between teaching and learning, custom "virtual" curricula can be developed to dynamically address the needs of every student as an individual. Each student can learn to her or his maximum potential.

Thursday, November 1, 2007

The Data Warehousing Applications of SIF

In SIF, the concept of interoperability has traditionally been applied to applications sharing data with each other. Each application in a Zone has the ability to publish data for which it is authoritative. Other applications in the Zone subscribe to that data as needed. For example, some student information systems subscribe to food services and transportation data. Almost all SIF-enabled applications have the ability to subscribe to the student demographic data supplied by the SIS.
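The publish/subscribe pattern described above can be sketched in a few lines of Python. This is a toy, in-memory illustration under my own assumptions: the Zone class, its method names, and the simplified record are hypothetical and do not represent the SIF messaging API or the work a real Zone Integration Server does.

```python
from collections import defaultdict


class Zone:
    """Toy in-memory stand-in for a Zone (hypothetical, not the SIF API)."""

    def __init__(self):
        # Maps an object type name to the callbacks subscribed to it.
        self._subscribers = defaultdict(list)

    def subscribe(self, object_type, callback):
        self._subscribers[object_type].append(callback)

    def publish(self, object_type, payload):
        """Deliver the payload to every subscriber; return the delivery count."""
        callbacks = self._subscribers.get(object_type, [])
        for callback in callbacks:
            callback(payload)
        return len(callbacks)


zone = Zone()
food_service_inbox = []
transportation_inbox = []

# Two applications subscribe to student data supplied by the SIS.
zone.subscribe("StudentPersonal", food_service_inbox.append)
zone.subscribe("StudentPersonal", transportation_inbox.append)

# The SIS, authoritative for student demographics, publishes one record.
delivered = zone.publish("StudentPersonal", {"RefId": "A1", "LastName": "Rivera"})
```

The SIS publishes once and never needs to know which applications are listening, which is the property that makes adding a new subscriber cheap.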

As data warehouse solutions become more prevalent in schools it is worth discussing how they can interoperate with SIF. One of the most labor-intensive components involved in implementing and supporting a data warehouse is the set of extract/transform/load (ETL) processes. ETL processes are responsible for mining data from operational systems (the “E”), modifying and scrubbing the incoming relational data to fit a dimensional model (the “T”), and then placing that data into the dimensional model or a staging area (the “L”). Whether the data warehouse has been built from scratch using industry-standard tools or it is a commercial product, significant challenges lie in getting intimate with the native data structures of the operational systems. Over time, operational systems' schemas change. This can add a potentially significant support burden to the data warehouse, just for the ETL component!

SIF has the potential to ease the ETL burden for data warehouse initiatives. First, SIF provides a standard data model that most leading vendors in the K12 space have adopted. In other words, the SIF Agents for those systems conform to a standard definition for data groups like student, staff, assessment, special programs, and finance. The SIF data model can thus be leveraged for ETL staging. Just be aware that SIF does not (yet?) provide any standards for dimensional data models.

Second, SIF provides a standard transport method for mining data out of the operational systems using XML over HTTP/S. Traditional approaches to ETL involve using ODBC, JDBC, file dumps, and other one-off approaches for accessing operational systems’ data. SIF’s Request/Response and Event models provide a much simpler way to load and maintain synchronization of operational data in a data warehousing environment. A “universal subscriber” Agent that stores incoming SIF data in its native XML representation or that “shreds” the XML into a relational staging area can be implemented for this purpose. The major advantage of the “universal subscriber” approach is that it essentially implements one interface to a Zone, versus implementing an interface per operational system as would be necessary with traditional ETL tools.
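As a rough illustration of the “universal subscriber” idea, the Python sketch below keeps each incoming object in its native XML form and also “shreds” it into a relational staging table. It rests on several assumptions of mine: the sample XML is a heavily simplified, made-up stand-in for a real SIF object, the table and function names are hypothetical, and the messaging layer that would actually receive Events and Responses from the Zone is left out entirely.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Simplified, illustrative payload; real SIF objects carry far more detail.
SAMPLE_OBJECT = """
<StudentPersonal RefId="A1B2">
  <Name><LastName>Rivera</LastName><FirstName>Ana</FirstName></Name>
  <GradeLevel>07</GradeLevel>
</StudentPersonal>
"""


def open_staging(path=":memory:"):
    """Create the two staging tables: raw XML and shredded leaf values."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_objects "
        "(object_type TEXT, ref_id TEXT, raw_xml TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS shredded_fields "
        "(object_type TEXT, ref_id TEXT, field_path TEXT, value TEXT)"
    )
    return conn


def shred(element, prefix=""):
    """Yield (path, text) for every leaf element beneath `element`."""
    for child in element:
        path = f"{prefix}/{child.tag}" if prefix else child.tag
        if len(child) == 0:
            yield path, (child.text or "").strip()
        else:
            yield from shred(child, path)


def stage(conn, xml_text):
    """Store one object both as raw XML and as shredded rows; return row count."""
    root = ET.fromstring(xml_text.strip())
    ref_id = root.get("RefId")
    conn.execute(
        "INSERT INTO raw_objects VALUES (?, ?, ?)", (root.tag, ref_id, xml_text)
    )
    rows = [(root.tag, ref_id, path, value) for path, value in shred(root)]
    conn.executemany("INSERT INTO shredded_fields VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = open_staging()
field_count = stage(conn, SAMPLE_OBJECT)
```

Because the shredding is generic (path and value pairs rather than one table per object type), the same code can stage any object the Zone delivers; the trade-off is that the transform step has to pivot those rows into whatever shape the dimensional model needs.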

Another caveat that must be addressed if you consider using SIF for ETL into a data warehouse is data quality. SIF provides a powerful, open mechanism for moving data. However, one must never forget the GIGO principle (garbage in, garbage out). As applied to SIF, GIGO says that if bad data is entered into an operational system, then bad data will be in ALL subscribing operational systems AND the data warehouse. How can this challenge be addressed?

I believe it starts with business processes. Organizational standards for data entry must first be defined. The people responsible for data entry must then be trained on those processes. An even better approach is to have them participate in development of the standards in the first place. There must also be systematic checks of data before it is loaded into its final destination, the dimensional model. The data quality checks can be implemented at various points in the process. The previously mentioned “universal subscriber” could be made responsible for some of the necessary data validation. The process that loads the staged data into the dimensional model would also likely share in the responsibility of checking data quality.
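A systematic check before the final load might look something like the Python sketch below. The field names, the set of valid grade levels, and the rules themselves are hypothetical examples of a locally defined data entry standard; they are not taken from the SIF specification.

```python
# Hypothetical local standard: grade levels an organization agrees to accept.
VALID_GRADES = {"PK", "KG"} | {f"{n:02d}" for n in range(1, 13)}


def validate_student(record):
    """Return a list of problems; an empty list means the record may be loaded."""
    problems = []
    if not record.get("ref_id"):
        problems.append("missing ref_id")
    if not (record.get("last_name") or "").strip():
        problems.append("missing last_name")
    if record.get("grade_level") not in VALID_GRADES:
        problems.append(f"unknown grade_level: {record.get('grade_level')!r}")
    return problems


def partition(records):
    """Split staged records into loadable ones and rejected ones with reasons."""
    clean, rejected = [], []
    for record in records:
        problems = validate_student(record)
        if problems:
            rejected.append((record, problems))
        else:
            clean.append(record)
    return clean, rejected


staged = [
    {"ref_id": "A1", "last_name": "Rivera", "grade_level": "07"},
    {"ref_id": "B2", "last_name": " ", "grade_level": "13"},
]
clean, rejected = partition(staged)
```

Keeping the rejected records together with their reasons matters as much as blocking them: that list is what goes back to the people doing data entry so the problem gets fixed at the source.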

SIF can be a great tool for inter-application data sharing. It can also play an important role in data warehousing initiatives by streamlining and simplifying ETL via a standard data model and data transport.