Extending SPLAC to include ADDR and PLAC? #536

Open
mother10 opened this issue Aug 16, 2024 · 56 comments

@mother10

Since I started reading GEDCOM, I have always wondered why in some places in GEDCOM ADDR has to be used, and in other places PLAC.
Inside ADDR we have tags for CITY, STAE, POST and CTRY.
But inside PLAC we have jurisdictions that do the same?!
In places in GEDCOM where only PLAC is allowed now, people start using an address as the leftmost, smallest jurisdiction, because they want to denote an address rather than a place.
Or they use the name of a church as the leftmost jurisdiction.

Now I have seen the proposals for SPLAC (see #520 and #527), and I want to add to #520, "Adding SPLAC beside PLAC", because that is more in line with what I will write here.
The comparison in that proposal was made with NOTE and SNOTE.
But I think that should be with REPO and SOUR.
Why?

1 (one) Repository can contain many sources. At an event, we can link to a Source, which can link to a Repository.

I wonder if addresses are not the same.

1 (one) City, can have many addresses. So at an event we should have just an address described, and that address should point to a City (SPLAC). (which in turn points to, a state, which ... etc just as in the new spec of GEDCOM)
The address at the event should NOT have CITY, STAE, POST and CTRY.
Because that way we have kind of the same information on more places in the GEDCOM.

And to maybe make things clearer, we should not call it ADDR, but maybe BUILDING (or an abbreviation of that like BLDNG)
BUILDING can be someones home, or a church or a castle, or a university, a Harbour etc.

Because in 1 home, more children can be born, and in 1 church many children can be baptised, or people can have their wedding, BUILDING should also be a record like structure, not a property of something.

In fact, BUILDING is the smallest form of SPLAC.
It should have a name or a title so it can be identified for a user.

So wherever it is now allowed to have either ADDR or PLAC, we will now have BUILDING there.
That just describes an address in user-understandable text, and points to the CITY SPLAC entity.
I looked at addresses around the world, and they seem way too complicated to "catch" in a specification.
So that's why I say: inside the BUILDING, the address is written as completely as the user wants, but that text will only be output by a program (shown to the user), NOT interpreted. It is not used to define where it is on a map, because of the complications.
Defining a BUILDING on a map is done by the link to the "first" SPLAC in the chain.
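
The BUILDING-to-SPLAC chain described above could be sketched roughly as follows. This is only an illustration: the class names `Place` and `Building`, the `parent` pointer, and the sample names are all assumptions, not part of any GEDCOM specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Place:
    """One hypothetical SPLAC-like record; `parent` points to the next larger place."""
    name: str
    type: str
    parent: Optional["Place"] = None


@dataclass
class Building:
    """Hypothetical BUILDING record: the address text is shown to the user, never parsed."""
    title: str
    address_text: str
    place: Optional[Place] = None  # link to the "first" place in the chain


def place_chain(building: Building) -> list[str]:
    """Walk from the building up through every linked place, smallest first."""
    names = [building.title]
    current = building.place
    while current is not None:
        names.append(current.name)
        current = current.parent
    return names


# Placeholder data, in the style of the "my..." examples used in this thread.
country = Place("mycountry", "COUNTRY")
state = Place("mystate", "STATE", parent=country)
city = Place("mycity", "CITY", parent=state)
church = Building("mychurch", "mystreet 12", place=city)

# Joining the chain gives a PLAC-style, comma-separated string.
print(", ".join(place_chain(church)))
```

Many events can point at the same `Building`, and many buildings at the same `Place`, which is where the REPO/SOUR comparison comes from.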

Now because there can be more BUILDINGs in a City, inside BUILDING we can also use MAP with its 2 coördinates to pinpoint its position in a City.
If that position is not present, it will be put at the center of the city it belongs to (the default position of the City itself). On top of others that have no further positioning.

By adding MAP to BUILDING, it will now also be possible to denote the birthplace when a child is born on a ship or in a plane, or in a car somewhere, because a hospital could not be reached in time. Same is valid for a Death at sea.
So instead of dying in the "Atlantic Ocean", which is so huge we have no idea where that might have been happening, we maybe are able to figure out from the route a ship took, and the date of the death, where it might have been and show a bit better in that immense ocean where it happened.
So BUILDING could also be a Plane or a Boat.

To me I think BUILDING should also have a NOTE structure and a SOURCE citation.

And SPLAC would have a CHAN and a CREA.

Areas:
I remember having seen, I think from Luther, talk about people wanting an area to denote an event.
That is also possible: we define RADIUS under MAP, give it meters or kilometers or any other unit, and we have a circle with a center, denoting the area where things took place.
1 MAP
2 LATI N18.150944
2 LONG E168.150944
2 RADIUS 5.4KM
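
A rough Python sketch of how a program might read this proposed structure and test whether a point lies inside the circle. RADIUS is only the tag proposed here, and the payload syntax (`5.4KM`, or `M` for meters), the function names, and the use of the haversine formula with a mean Earth radius are all assumptions made for illustration.

```python
import math
import re

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def parse_coordinate(value: str) -> float:
    """Convert a coordinate such as 'N18.150944' to signed decimal degrees."""
    hemisphere, degrees = value[0].upper(), float(value[1:])
    return -degrees if hemisphere in ("S", "W") else degrees


def parse_radius_km(value: str) -> float:
    """Parse the hypothetical RADIUS payload, e.g. '5.4KM' or '800M', into kilometers."""
    match = re.fullmatch(r"\s*([0-9]*\.?[0-9]+)\s*(KM|M)\s*", value.upper())
    if match is None:
        raise ValueError(f"unsupported RADIUS payload: {value!r}")
    number, unit = float(match.group(1)), match.group(2)
    return number if unit == "KM" else number / 1000.0


def distance_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def in_area(center_lat: float, center_lon: float, radius_km: float,
            lat: float, lon: float) -> bool:
    """True when (lat, lon) lies within radius_km of the center."""
    return distance_km(center_lat, center_lon, lat, lon) <= radius_km


center_lat = parse_coordinate("N18.150944")
center_lon = parse_coordinate("E168.150944")
radius = parse_radius_km("5.4KM")
print(in_area(center_lat, center_lon, radius, center_lat + 0.01, center_lon))
```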

Instead of RADIUS we could also have SQUARE or RECT like:
1 MAP
2 LATI N18.150944
2 LONG E168.150944
2 RECT X:+5.7KM Y:-5.7KM
Depending on the + or -, the coordinates denote which corner of the rectangle it is. (With both +, the point is the bottom-left corner; with both -, it is the top-right corner.)
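
To make the corner rule concrete, here is a small Python sketch that turns such a RECT payload into a bounding box. Everything in it is an assumption for illustration: the payload grammar, the function names, the reading that positive X extends east and positive Y extends north, and the flat conversion of about 111.32 km per degree of latitude. It also ignores wrap-around at the 180° meridian.

```python
import math
import re

KM_PER_DEGREE_LAT = 111.32  # rough average, good enough for a sketch


def parse_rect(value: str) -> tuple[float, float]:
    """Parse a hypothetical RECT payload like 'X:+5.7KM Y:-5.7KM' into signed km offsets."""
    pattern = r"X:([+-]\d+(?:\.\d+)?)KM\s+Y:([+-]\d+(?:\.\d+)?)KM"
    match = re.fullmatch(pattern, value.strip().upper())
    if match is None:
        raise ValueError(f"unsupported RECT payload: {value!r}")
    return float(match.group(1)), float(match.group(2))


def bounding_box(lat: float, lon: float, x_km: float, y_km: float):
    """Return (south, west, north, east) for a rectangle anchored at the corner (lat, lon)."""
    dlat = y_km / KM_PER_DEGREE_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    dlon = x_km / (KM_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    south, north = sorted((lat, lat + dlat))
    west, east = sorted((lon, lon + dlon))
    return south, west, north, east


x_km, y_km = parse_rect("X:+5.7KM Y:-5.7KM")
# X positive and Y negative: the MAP point is the top-left corner.
print(bounding_box(18.150944, 168.150944, x_km, y_km))
```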

But to me RADIUS seems easier.

My guess is this will also reduce GEDCOM size, as a lot of "doubled text" (from all the ADDR's that are in fact the same) will be removed and move into the corresponding SPLAC records.

If BUILDING has a MAP structure and SPLAC too, the MAP of BUILDING should be used to show on a map, as that's more precise. The MAP of a City points to an arbitrary point inside the City: a default point, in case the BUILDINGs pointing to that City have no MAP structure.
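
The fallback rule in the previous paragraph amounts to "use the most precise MAP that exists". A minimal Python sketch, assuming the program has already collected the optional coordinates from the BUILDING upward (the function name and the tuple representation are made up for this example):

```python
from typing import Iterable, Optional, Tuple

Coordinates = Tuple[float, float]  # (latitude, longitude) in signed degrees


def first_known_position(
    maps_smallest_first: Iterable[Optional[Coordinates]],
) -> Optional[Coordinates]:
    """Return the first MAP that is present, checking the most precise level first."""
    for coordinates in maps_smallest_first:
        if coordinates is not None:
            return coordinates
    return None


building_map = None                   # this BUILDING has no MAP of its own
city_map = (18.150944, 168.150944)    # default point of the City
print(first_known_position([building_map, city_map]))
```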

In the SPLAC beside PLAC md file, I think there is an SPLAC record missing under RECORD :=
For the TYPE I would choose to have everything in Uppercase, so CITY, COUNTY, STATE, COUNTRY
I think it might be more clear if there also was an example with a real placename.

Maybe, in case of the above example for an Ocean, have a TYPE OCEAN too?
And maybe a Type AIR?
These last 2 Types have no other SPLACs that they point to, or that point to those I presume.

Some other thing about SPLAC:
Now it has coördinates. But what if there were more ways of denoting a place than just LAT and LONG and an address?
Then it might need a TYPE to tell what mapping system is used. And maybe more than 1 mapping system can be used for 1 SPLAC?
There seem to be other systems like What3words, UTM coding, Plus Codes, and maybe more.
But that does not seem very common yet?

I am sure I forgot things, but I wanted to add it here, to maybe inspire someone.

@Norwegian-Sardines

In 40 plus years of using GEDCOM I've never used the Address_Structure! As far as I'm concerned it is useless! But to each their own!

If I need to include an address for a place it is just another entry in the PLAC text. Just like if I need to add a grave marker (because I know the exact location of that marker) I put either the word "Grave" or if I know whatever the cemetery uses to locate the grave (i.e. row, section, columbarium number, "At Sea", etc.)

As far as items that are bigger than a point (Y/X coordinates) we should also have a polygon option in a "PLAC" node.

I also want a TEXT tag that can use a markup/markdown language option to create a formatted description of the place being saved!

If I was part of the SPLAC committee these are things I would add, but I'm not!!

@dthaler
Collaborator

dthaler commented Aug 20, 2024

Discussion in GEDCOM Steering Committee 20 AUG 2024:

  • Addresses were first added to GEDCOM for submitters.
  • Some postal addresses are not "places" per se, such as "PO Box 1234" or "IRS, Kansas City"
  • It is common today for many places to have an address, but in historical records it is much more uncommon for places to have a postal address. And agreed that some places have no address like an event that someone died at sea in the Atlantic.
  • We are looking at having an open zoom meeting for SPLAC discussion, since we want input from the community before making any decisions. We appreciate this discussion in this and related issues. The proposed date is September 10th, morning in US, late afternoon/early evening in Europe. Watch for a posting in the discussions section of this repository.

@tychonievich
Collaborator

A somewhat related topic: in my opinion, ADDR is more like EXID than it is like PLAC: it's an identifier, usually defined or accepted by some kind of postal system, for a receptacle or recipient of letters and packages. I think that this is consistent with the specification for ADDRESS_STRUCTURE's wording "as it would appear on a mailing label", but could be more clear (or could be an incorrect reading). Like @Norwegian-Sardines I rarely use it and would be happy to use detailed PLAC instead as @mother10 proposes, but I also think that there are cases where a postal address identifier is useful (identifying SUBM, REPO, and postal addresses that appear in sources, such as the address of a lawyer associated with a legal document).

@mother10
Author

In my tree, since around 1850 or so, a kind of "housecard" was used. Each card was for 1 specific house in some suburb or street (depending on the way addressing was done). Those cards hold a lot of important info: names, relations of the people living there, dates of birth, when they arrived and left, and where they came from or went to. So yes, I have addresses in my tree, but only from that time and later. So there should be some way of expressing that. But like Luther said, that's more an EXID thing: street (suburb) and housenumber. All the rest can go in SPLACs.

Further, as I already mentioned, in the forums of the program I use, people asked to have the leftmost of the PLAC jurisdictions to be an address. Probably for the same reason.

So a postal address is not just for today, for lawyers and such.

@Norwegian-Sardines

Tineke,

If I’m reading you correctly, these “housecards” sound like Source_Records and therefore assert names, relations, birth dates, residence and other “facts”.

Just like a census that also asserts facts, I would create a Source_Record for the census (i.e. 1920 US Census) with a date of the enumeration and associate it with all of the facts it asserts including the Residence, I would do the same with your “housecard”. In both cases I add the house and street number to the PLAC tag.

In my application we have extended the application to have additional information about each item in the comma delimited PLAC payload that provides detail about the item/location including images and text/history. Meaning I can view information about the division, address, country etc. of any place in my database!

@mother10
Author

I understand what you say.
But in the Dutch application my data comes from, those are all addresses. They are added as "facts" to a person, showing where that person lived, with a from date and an end date.
So many Dutch people will have it like this as that program is the most widely used program in our country.

But yes, also a SOUR.

@albertemmerich
Collaborator

@Norwegian-Sardines:
Let me show why I am using both PLAC and ADDR, even if I have the building in the PLAC hierarchy:

2 PLAC myhouse, myvillage, mycounty, mystate, mycountry
3 FORM building, city, county, state, country
2 ADDR mystreet myhousenumber
3 CONT zip mymunicipality

The PLAC hierarchy is NOT the same as the ADDR hierarchy. I cannot put the ADDR as part of PLAC, as PLAC has no zip code, and as PLAC has the jurisdiction "city", which the address does not have, and municipality is used by the postal address, but not part of PLAC. This is not only theory, in my own postal address you will not find the city, but the municipality. And in the PLAC I do not show the municipality, but the city..
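
To illustrate the difference, here is a small Python sketch that pairs the PLAC payload from the example with its FORM and keeps the ADDR payload separate. The function name `split_place` is an assumption; the values are the placeholders from the example above.

```python
def split_place(plac: str, form: str) -> dict[str, str]:
    """Pair each comma-separated PLAC part with the jurisdiction name from FORM."""
    parts = [part.strip() for part in plac.split(",")]
    labels = [label.strip() for label in form.split(",")]
    if len(parts) != len(labels):
        raise ValueError("PLAC and FORM have a different number of entries")
    return dict(zip(labels, parts))


place = split_place(
    "myhouse, myvillage, mycounty, mystate, mycountry",
    "building, city, county, state, country",
)
# The ADDR payload is a separate multi-line text (ADDR line plus CONT line).
address = "mystreet myhousenumber\nzip mymunicipality"

print(place["city"])            # the PLAC side has a city
print("municipality" in place)  # but no municipality; that appears only in ADDR
```

The two structures share no field that a program could reliably use to convert one into the other, which is the point being made here.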

myhouse can be a farm name, or something like that. It refers to the same location as the address does; however, the address is like an EXID (see Luther's comment), which, to make it more complicated, may vary from time to time. Some hundreds of years ago little villages had no street names, only house numbers. Later the street names were introduced, together with new house numbers. Then the postal code was introduced; however, that system has been modified twice since then.

So if we add ADDR to the SPLAC records, we will have an own substructure for the address in the SPLAC record. And that substructure must have a DATE substructure to show the time dependency.

@mother10
Author

Oh, and the extended information you mention will now go in the different SPLACs, that is, if they choose SPLAC not inside PLAC.

@Norwegian-Sardines

but I also think that there are cases where a postal address identifier is useful (identifying SUBM, REPO, and postal addresses that appear in sources such the address of a lawyer associated with a legal document)

GEDCOM v5.5.1 lacks the fields to capture enough information about an "artifact source", so here is what I've done for documents that were created (like a legal document): the author (AUTH) is the lawyer/writer/scribe of the document, and any "location based" information follows the suggestion for unpublished works, namely to use the Source_Publication_Facts payload and enter the city and state of the writer of the document!

Maybe Source_Record needs a template for other data, just like I proposed for creating a Citation_Record!

@Norwegian-Sardines

Norwegian-Sardines commented Aug 20, 2024

Albert,

The PLAC hierarchy is NOT the same as the ADDR hierarchy. I cannot put the ADDR as part of PLAC, as PLAC has no zip code, and as PLAC has the jurisdiction "city", which the address does not have, and municipality is used by the postal address, but not part of PLAC. This is not only theory, in my own postal address you will not find the city, but the municipality. And in the PLAC I do not show the municipality, but the city..

I don’t understand the difference between a city and a municipality. In Norway, people who live on a farm live in a Kommune not a city, I put the Kommune name rather than a city in the PLAC hierarchy. In the US we have rural townships, I use these when locating a farm, because city does not make sense!

I agree that you need to have an address and zip code, but what are addresses and zipcodes for? Mailing letters! Personally I think GEDCOM should not be used to create mailing labels, but to each their own!

@Norwegian-Sardines

Norwegian-Sardines commented Aug 20, 2024

On the other hand, if you did want to use GEDCOM to maintain an address book, I think putting the Address_Structure inside the Place_Structure is incorrect! I’d rather see it used as a stand alone Fact with a date range rather than as a subtag of all facts, having the zip code and a mailing label address layout for facts like OCCU and CENS makes little sense, most likely you are not going to send a letter (using zip code) to these places!

@albertemmerich
Collaborator

In my region (Germany) we have:
building
city (the name of village/city where you live)
municipality (the lowest administration level)
county (next administration level)
district (administration level, only in some states)
state
country

In some parts of Germany we have another administration level in between municipality and county.

By the given hierarchy a city is administrated by a municipality.

The postal address in most cases is built using the municipality, not the city (sometimes the names of these levels are the same, then you do not see the difference). And the postal address is built by using street name and house number, not the farm name. The farm name is part of PLAC; a different street name with house number (pointing to this farm) is not part of PLAC.

What are the addresses for? To store this information as it is given by several sources, like civil birth and death records, books with all addresses of people living in a city in a defined year (like a census), letters written from a sender and his address to a receiver and his address, and contracts of two parties identified by name and address. Often the address is very important information for identifying the individual: someone with the same name may be living in the same village at the same time. Zip codes are very important to identify municipalities (we have a lot of them with the same name, however in different counties or states).

If you do not document the addresses, you do not need the ADDR. Your decision. If you do document, you need it. And GEDCOM is to transfer all data we collect in our genealogical research.

@Norwegian-Sardines

I can understand your requirement to record the mailing address in these cases although I would only add it as a NOTE for the fact rather than as a Formatted Mailing Label, because the formatting has less value!

Currently, the Place_Structure and the Address_Structure are not related to each other.
Event_Detail

This means that we can record an address {full formatted address as it would appear on a mailing label, including appropriate line breaks (encoded using CONT)} related to the fact that is structurally different from (and independent of) the Place_Structure. This would work for your intended need: recording a mailing address that is not the same as the location (PLAC) of the "fact".

Adding the mailing address for all levels of the SPLAC binds that mailing address to that SPLAC instance and to all uses of that SPLAC instance, and would require a date-driven Address_Structure as well if the "location" stayed the same but some element of the Address_Structure changed.

I would rather see the Place_Structure, Address_Structure and the Shared_Place_Structure all be separate data elements not dependent on each other

@mother10
Author

Hi all,
Have been reading your comments and thought about them.
I would like to step back and see what we are doing here. (For the "we" read in fact you all)
We started with a PLAC structure and that was cut into separate individual pieces, those pieces could be linked together in a defined way, to form the original PLAC.
Now the cutting gave in fact Jurisdictions, so why were they called SPLAC? Because they came from PLAC?
They are new, not yet existing structures for GEDCOM, so as they are in fact jurisdictions, why not call those JURIS structures then?
To people using GEDCOM that might sound way more logical because thats in fact what they are.
By naming them SPLAC, that sounds like "place" it gives confusion. They are no places in itself, only when together they form a complete string, they form a PLAC.

Choosing the right name for something can help tremendously in understanding new things.
As I said in the starting post, they work like REPO and SOUR, only this time they are not just two steps, they form a whole "staircase".
The bottom step is the position where an event happened, the topmost step is the country where it happened.
To have a correct "staircase" we need a topmost step, 1 or more steps in between, in sequence of area size (as jurisdictions are), and a bottom step.
The bottom step should be the most accurate position we can have for an event.
Often that can be a home, but it can also be a farm or a commune or other things as I mentioned before, and as others stated here. And maybe we only have a placename.

Should the topmost step be "WORLD" because then we can link events on the Atlantic and such, directly to JURIS "WORLD" and we would also have a correct PLAC according to the definition.

Now the "mailing" Addresses. They give confusion, thats why I started by saying we should leave out CITY, STAE, POST and CTRY.
They should be, like Luther said, EXID kind of things. Their contents should be the responsibility of the user.
Why?
Because there are so many ways addresses can be written. And it's almost impossible for a program to construct a correct one.
So in case an address is needed for a contract of a lawyer and such, that should be entered by the user in an address, where the address is more like a NOTE with more lines of text, controlled by the user (as Norwegian-Sardines said).
That address could be used to send a letter or parcel to, but I don't think it should be the responsibility of GEDCOM to check if it's a correct address. GEDCOM should treat that as a NOTE.

The other type of address, as I mentioned on the "housecards", they have a street and a housenumber, or, as Albert said, in older times they were just housenumbers in a small village. Maybe we should better call those "Positions".
These addresses/positions form the lowest "step" of the JURIS staircase.

They are not used to send a letter, only to position an event on a map. (Hence I said they all might have a MAP structure)

It can very well be that we cannot really locate those "address"-JURIS's (anymore), because we have no idea where that street and housenumber might have been. They could have been destroyed for some reason (flood, bombing, anything) or might have disappeared because new houses were built there.
But no matter what, those smallest JURIS's (Positions) are the bottom step in our PLACE staircase.

Maybe we dont know the smallest JURIS, and we only know the City, then that is the lowest step of the staircase.

As @Norwegian-Sardines and others said, the way of addressing a "position" varies a lot across countries and even inside one country. The sequence of the steps is not the same everywhere.
So the JURIS's should have a TYPE, the same as the jurisdiction names in the PLAC statement have now.
Extra TYPEs are needed because of the variations in jurisdictions between countries. (Like Municipality, CODE-Insee and more.)
In 1 "staircase" each TYPE is only allowed once. The sequence of the TYPEs is the user's responsibility.

JURIS should have the possibility (as Norwegian-Sardines mentioned) to store extra information to describe a city or a state or more. Adding pictures should also be possible.
If we define something new, let's think carefully about what should go in there, so we don't have to add forgotten things too soon after the release.

I would like to add 1 extra TAG here.
As the PLAC is now, with jurisdictions that can be empty, that gives users sometimes placenames like:
, Den Helder, ,
So with "empty" commas.
Now, just as the proposal for the new NAME structure has a TYPE "RUFNAME" to denote which name should be underlined, I would like to propose a TAG CITYNAME (or alike) to denote which of the JURIS names should be presented to the user.
This tag is only allowed in 1 (one) of the steps in the JURIS staircase.
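
The two rules above (each TYPE only once per staircase, and the CITYNAME-like marker on at most one step) are easy to check mechanically. A rough Python sketch, in which the `Step` class, the `display` flag standing in for the proposed CITYNAME tag, and the TYPE values are all hypothetical:

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Step:
    """One hypothetical JURIS step; `display` models the proposed CITYNAME-like marker."""
    type: str
    name: str
    display: bool = False


def check_staircase(steps: list[Step]) -> list[str]:
    """Return a list of rule violations (empty when the staircase is valid)."""
    problems = []
    counts = Counter(step.type for step in steps)
    for type_name, count in counts.items():
        if count > 1:
            problems.append(f"TYPE {type_name} used {count} times")
    if sum(step.display for step in steps) > 1:
        problems.append("more than one step is marked for display")
    return problems


def display_name(steps: list[Step]) -> str:
    """Name shown to the user: the marked step, else all non-empty names joined."""
    for step in steps:
        if step.display:
            return step.name
    return ", ".join(step.name for step in steps if step.name)


# The ", Den Helder, ," case: four steps, only one of them filled in.
steps = [
    Step("VILLAGE", ""),
    Step("CITY", "Den Helder", display=True),
    Step("PROVINCE", ""),
    Step("COUNTRY", ""),
]
print(check_staircase(steps))
print(display_name(steps))
```

Even without the marker, skipping empty names on output avoids showing the "empty" commas to the user.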

@Norwegian-Sardines

Norwegian-Sardines commented Aug 21, 2024

I'm still going through all of what you wrote, so I'll answer smaller bits as I find them!

Now the cutting gave in fact Jurisdictions, so why were they called SPLAC? Because they came from PLAC?
They are new, not yet existing structures for GEDCOM, so as they are in fact jurisdictions, why not call those JURIS structures then?
To people using GEDCOM that might sound way more logical because thats in fact what they are.
By naming them SPLAC, that sounds like "place" it gives confusion. They are no places in itself, only when together they form a complete string, they form a PLAC.

For me (and maybe this is just my interpretation) a "Place" does not stop with "jurisdictions"! If I have a BURI fact I also want to have a place to record the location of the exact grave marker. In the software I use we can map to a specific spot on the map, including the exact location of the grave. The same is true for RESI (residence): I want to include the street address or farm name in the PLAC tag and map that location. This is why I've used the SPLAC tag: it is a "Shared_Place".

In addition, a place can have multiple parent places. Let's take an example: The city of Gdansk in what is now Poland was for a time part of the Prussian Empire and before that was officially in the "Kingdom of Poland". Individuals born in various parts of Poland during times of annexation and control could have birth certificates stating Prussia. The hierarchy for Gdansk should include all possible parent jurisdictions with From/To dates. My own grandfather was born in "The Hungarian Empire"; the town has changed names but it is now in Serbia! We need alternate names with a date range (from/to)!
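
A minimal Python sketch of parent jurisdictions with From/To dates, to show how a program could pick the parent that applied at the time of an event. The `ParentLink` class and the lookup function are assumptions, and the names and years below are deliberately made-up placeholders, not historical data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParentLink:
    """Hypothetical dated pointer from a place to one of its parent jurisdictions."""
    parent: str
    from_year: Optional[int]  # None means "no known start"
    to_year: Optional[int]    # None means "still valid"


def parent_in_year(links: list[ParentLink], year: int) -> Optional[str]:
    """Return the parent jurisdiction valid in the given year, if any."""
    for link in links:
        started = link.from_year is None or link.from_year <= year
        not_ended = link.to_year is None or year <= link.to_year
        if started and not_ended:
            return link.parent
    return None


# Placeholder names and years, for illustration only.
links = [
    ParentLink("State A", None, 1899),
    ParentLink("State B", 1900, None),
]
print(parent_in_year(links, 1850))
print(parent_in_year(links, 1950))
```

Alternate names with a date range could be stored and looked up the same way.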

One other reason I want to have separate SPLAC nodes is so I can associate formatted text with the node to describe the place for my readers who are not "history people" but still may be interested in the history of the place, be that place a farm, city, country.

We could name SPLAC something else but my vision is to deprecate PLAC at some point.

@albertemmerich
Collaborator

Hi Tineke,
JURIS would not meet the real sense of places, as they include buildings, cemeteries, churches, villages. All these objects are not jurisdictions, only the administrative levels following in the hierarchical structure of PLAC are jurisdictions.
The German GEDCOM-L group did not use SPLAC, because the old PLAC is not what the new object / record will be: PLAC is the name of a place, added with comma separated hierarchical administrative jurisdictions. "SPLAC" is quite another object: it is the object on a certain level of all these buildings, villages, and jurisdictions. It may have children objects (lower level objects) or parent objects (higher level objects). These objects may be places, administrations, religious objects (like churches and parishes), geographical objects (like a region or a continent).
The German GEDCOM-L group called all these objects "locations" and created the tag _LOC for it. These location records are already exported by some of the programs within the GEDCOM-L group, and webtrees has an extension to include this solution, too.
I am with you that SPLAC is not a good name for this kind of record, as they are NOT shared PLACes, but objects of quite another type. They include a lot of data of the substructure of PLAC, but the PLAC itself is modified to a hierarchical system of records and no longer has comma-separated hierarchical "jurisdictions"!

The _LOC record has a broad set of subtags. You find it in http://genealogy.net/GEDCOM/GEDCOM551%20GEDCOM-L%20Addendum-R2.pdf on page 13, defined as extension to 5.5.1 (it works with 7.0, too)

@Norwegian-Sardines

Norwegian-Sardines commented Aug 21, 2024

Albert,

I don’t see any difference between the word “place” (a particular position or point in space) and “location” (a particular place or position) Oxford Dictionary. So the only reason I use the tag SPLAC is because it is a place on a map that is shared and it replaces PLAC at some point. And SPLAC is similar to the new SNOTE.

@Norwegian-Sardines

Other software has an extension that extends the PLAC tag to a shared (0 level) position but does not create a hierarchy. So SPLAC can be used by them without much change in their code!

@mother10
Author

Thanks for all input. Now we wait what the committee decides.

@tychonievich
Collaborator

Discussed in steering committee

  • There are cases where places and addresses are sufficiently distinct that the current ADDR beside PLAC makes sense, for example when the postal address associated with an event identifies some location other than where the event occurred.
  • There are cases where a place stays fixed and has several different addresses over time, which makes PLAC.ADDR make sense, possibly with DATE and SOUR substructures.
  • There are addresses that cannot be associated with PLAC/SPLAC (like post boxes)
  • The previous points may imply adding ADDR both under PLAC/SPLAC and in their current event location, but that leaves open questions about how to decide where it goes in each situation.
  • There seems to be some agreement that ADDR is "like EXID" in being an identifier which may or may not align perfectly with a PLAC or jurisdiction.
  • While many addresses are related to jurisdiction hierarchies (motivating the support for fields like CITY), others are not. A historical source may provide either address or place or both.
  • There are users who do not use ADDR at all (with reasons for that choice), but also users who do (with reasons as well).
  • GEDCOM-L's _LOC extension has a _LOC.TYPE.DATE, which is valuable in tracking time-varying location types, and may have bearing on some of the points raised in comments above. Some TYPEs are more likely to have ADDRs than others.
  • The "right" tag for this (SPLAC/JURIS/LOC) is complicated because each English word either leaves out some use cases or is somewhat ambiguous. Picking the right tag name will probably come after finishing the specification of the first version.

There are multiple open issues here that deserve additional thought, design, and discussion. Some of that will happen in the open meeting on SPLAC, announced in #538, though this is a big enough topic that the conversation will probably extend beyond that meeting too.

@Norwegian-Sardines

Norwegian-Sardines commented Aug 27, 2024

If the <<Address_Structure>> is to be maintained as subtags of PLAC and Shared_Place_Records I think we should revisit the design of the <<Address_Structure>>. Currently the <<Address_Structure>> has the following definition:

The payload is the full formatted address as it would appear on a mailing label, including appropriate line breaks (encoded using CONT (p.73) tags). The expected order of address components varies by region; the address should be organized as expected by the addressed region.

Which for the most part is only valuable when either viewed by a human or used when creating a mailing label. The individual data points within the structure cannot be parsed into meaningful parts, and the payload is of little value beyond what we could put into a NOTE, except that it is defined as an address.

The other tags found in the <<Address_Structure>> {ADR1, ADR2, ADR3, CITY, STAE, POST, CTRY} (as noted in the GEDCOM Specification) are best used by systems that may "have structured their addresses for indexing and sorting." and are subject to deprecation. This may be a good time (v7.1 GEDCOM) to actually deprecate these subtags. We could also then instead of having a "Structure" just implement the ADDR tag with implied CONT subtags.
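
With only the ADDR payload and CONT, writing an address comes down to splitting the text on line breaks. A small Python sketch of that encoding (the function name is an assumption, and any other payload escaping rules are left out):

```python
def addr_lines(level: int, address: str) -> list[str]:
    """Encode a multi-line address as one ADDR line followed by CONT lines."""
    first, *rest = address.split("\n")
    lines = [f"{level} ADDR {first}" if first else f"{level} ADDR"]
    for line in rest:
        # An empty continuation line is written without a trailing space.
        lines.append(f"{level + 1} CONT {line}" if line else f"{level + 1} CONT")
    return lines


for line in addr_lines(2, "mystreet myhousenumber\nzip mymunicipality"):
    print(line)
```

The program never needs to know which line is the street and which is the municipality; it only stores and shows the text in the order the user wrote it.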

@mother10
Author

@Norwegian-Sardines
Yes I agree with that!
I think I already said to take out City etc.

@Norwegian-Sardines

Norwegian-Sardines commented Sep 9, 2024

In advance of the upcoming meeting I'd like to express some thoughts for discussion on the topic of the PLAC tag in a future release.

These thoughts are not about a specific solution (structure or design), but instead some of the elements I'd like to see in the solution.

  1. I was trained in science to record my observations, therefore the solution must contain a data element dedicated to the observed "Place of the Event"
  2. Interested parties, present day readers and modern maps need current location names, therefore the solution must contain a data element dedicated as a finding aid that describes the "Current Place Name of the Event"
  3. Because places can be known by several names and/or have different superior jurisdictions in their history, the solution must contain a data element dedicated to "Alternate Names". Optionally Dates and Language information may be included.
  4. A place is like an Individual, with a history that goes beyond its name; the solution should contain a data element that can be used to record information about the place. History information may directly relate to the target family or individual. (The language used in the text should be included in this element.)
  5. Images (maps, photos) from/of the place are important to maintain and distribute. The solution should contain a data element that can be used to record and maintain digital images of the place of the event.
  6. Map location coordinates of the place. The place may not be found on general purpose maps, such as graves, homes and other buildings. A researcher may have recorded a location in the field, therefore the solution must contain a set of map elements to record the location of the place.
  7. A way to reuse any of the above data elements for any event that occurred in the same place to reduce redundancy and increase accuracy.

This list can be discussed, revised and augmented and can be used as a starting point for any design ideas.

Thanks

@albertemmerich
Collaborator

I would like to add:

  8. Reference to elements in location databases in internet (As by FamilySearch, geonames, gov.genealogy.net)
  9. link to webpages describing the place
  10. type of place. Optionally Dates and Language information may be included.
  11. Zip code of place. Optionally Dates information may be included.
  12. Pointers to higher level elements (political, religious, geographical hierarchies). Optionally Dates and Type information may be included.
  13. Source citations for all data

Adding to 6.: for bigger elements not only coordinates, but the area

@Norwegian-Sardines

Norwegian-Sardines commented Sep 9, 2024

I think addition number 12 references design of the solution rather than a desire! A solution may not include "higher levels" as indicated in previous discussions!

But this can be discussed in detail tomorrow.

@albertemmerich
Collaborator

I agree "pointer" is part of a solution. Desire is to document all hierarchical connections in between places on different levels at any time.

@mother10
Author

Maybe the <<Address_Structure>> as @Norwegian-Sardines said, should be an ADDR with CONT and CONC indeed.
But I wonder if ADDR should not have a TYPE too.
One TYPE is just for addressing labels, the other TYPE is for describing the place of an Event.
I can have a child born at ChurchRoad 12, but I am not able to give birth to a child inside P.O box 1234.

And the Addressing label type, should not that have the possibility to have a Barcode added?
As when I send/receive parcels I see everyone scanning barcodes, and when you send a parcel, it gets a barcode that is scanned at every step the parcel passes.
Don't know if that fits in the coming discussion too.

@Norwegian-Sardines

Maybe the <<Address_Structure>> as @Norwegian-Sardines said, should be an ADDR with CONT and CONC indeed. But I wonder if ADDR should not have a TYPE too. One TYPE is just for addressing labels, the other TYPE is for describing the place of an Event. I can have a child born at ChurchRoad 12, but I am not able to give birth to a child inside P.O box 1234.

And the Addressing label type, should not that have the possibility to have a Barcode added? As when I send/receive parcels I see everyone scanning barcodes, and when you send a parcel, it gets a barcode that is scanned at every step the parcel passes. Don't know if that fits in the coming discussion too.

What I do for Addresses of places is:

2 PLAC ChurchRoad 12, My Town, My State, My Country

I don’t need an <<Address_Structure>> for this case.

@Dirk-Ahnenblatt

What I do for Addresses of places is:

2 PLAC ChurchRoad 12, My Town, My State, My Country

I don’t need an <<Address_Structure>> for this case.

Some comments from a genealogy software provider ...

  1. This may work for a residence (although I'm wondering whether it shouldn't be “12, ChurchRoad, ...”, or whether “My Town” should also include the zip code), but what about other addresses like hospitals or cemeteries.

1 BIRT
2 PLAC University Hospital, ChurchRoad 12, My Town, My State, My Country

1 BURI
2 PLAC Central Cemetery, ChurchRoad 12, My Town, My State, My Country

One more hierarchy level. All possible, but it does not make things easier, because it is not needed in every case.

  2. GEDCOM is just a transport medium for genealogical data, but the real usage is in genealogy software. What would be the output for, e.g., the birth place in your example? The whole PLAC string? Or just the first part, truncated at the first comma?

That's why I am not a big fan of this hierarchy in PLAC tags and prefer the town/city in first place in the PLAC tag (What is your birth place? ChurchRoad 12!).

Conclusion:
I would prefer not only <<Address_Structure>> but also ADDR records, because there is a lot of redundancy in a single GEDCOM file if all residences are filled in, and additional information (e.g. coordinates) would be a benefit.

Dirk (www.ahnenblatt.com)

@Norwegian-Sardines

Norwegian-Sardines commented Sep 11, 2024

My birth place:

2 PLAC City Center Hospital, Birth Town, Birth County, Birth State, USA

One more hierarchy level. All possible, but it does not make things easier, because it is not needed in every case.

What does this mean? If it is not needed don’t put it in!

If I did not know the hospital name,
2 PLAC Birth Town, Birth County, Birth State, USA

What would be the output for i.e. birth place in your example? The whole PLAC string? Or just the first part and truncated from the first comma?

In my software two things are output:

  1. A map based off the PLAC data
  2. The PLAC data.

Since GEDCOM v5.5.1 does not have a shorter (abbreviated) value the whole string is output on the screen and/or report. Simple!

NOTE: My proposal would be to include an abbreviated place name!

@Norwegian-Sardines

This may work for a residence (although I'm wondering whether it shouldn't be “12, ChurchRoad, ...”, or whether “My Town” should also include the zip code), but what about other addresses like hospitals or cemeteries.

  1. You could put the 12 before the street. In some countries the address is given as street, then house number; in my area we say the house number first.
  2. The comma after the number is ok, I’d prefer it without. In GEDCOM the comma represents a different level in the hierarchy and in my software this means the street gets a map location as well as the house and the city.

“My Town” should also include the zip code

If it has to be there then include it, but in the USA a town could have dozens of zip codes and it provides no extra value locating the town on the map. In another country the zip code may be assigned to a town and used to differentiate two towns with the same name such as, “MyTown 1234, Country” from “MyTown 5678, Country”.

@Dirk-Ahnenblatt

My birth place:

2 PLAC City Center Hospital, Birth Town, Birth County, Birth State, USA

One more hierarchy level. All possible, but it does not make things easier, because it is not needed in every case.

What does this mean? If it is not needed don’t put it in!

If I did not know the hospital name, 2 PLAC Birth Town, Birth County, Birth State, USA

The meaning of each part between the commas should be defined in HEAD.PLAC.FORM. In your case ...

1 PLAC
2 FORM Place name, Street, City, County, State, Country

... where "City Center Hospital" is the Place name.
When using hierarchical place names you can omit one part (like Place name) but have to use the same number of commas. Otherwise the PLAC value can't be parsed and is just a line of text.
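
A minimal sketch of the positional parsing described here, assuming the FORM labels and the PLAC payload are both split on commas and matched one to one. The helper name `split_place` is hypothetical and is not part of any GEDCOM specification or tool; the example values are the ones used in this thread.

```python
def split_place(plac_value, form_value):
    """Map each comma-separated part of a PLAC payload to its FORM label.

    Returns None when the number of parts differs from the number of labels,
    because the payload can then no longer be interpreted by position.
    """
    labels = [label.strip() for label in form_value.split(",")]
    parts = [part.strip() for part in plac_value.split(",")]
    if len(parts) != len(labels):
        return None
    return dict(zip(labels, parts))


FORM = "Place name, Street, City, County, State, Country"

# The street is unknown, but its comma is kept, so every part stays in position.
full = split_place(
    "City Center Hospital, , Birth Town, Birth County, Birth State, USA", FORM
)

# Two leading parts are dropped together with their commas: no positional match.
unparsed = split_place("Birth Town, Birth County, Birth State, USA", FORM)
```

In this sketch `full["Street"]` is an empty string while `full["City"]` is still "Birth Town", and `unparsed` is `None`, which is the "just a line of text" case.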

About street numbers and zip code:
I thought you want to have a hierarchical place structure within PLAC tag defined by HEAD.PLAC.FORM. It seems that I was wrong.
I thought you use the PLAC value as a replacement for the address structure. But you are not interested in zip code (as in ADDR.POST).
Your PLAC value is just a line of text. It could start with a place name (i.e. hospital name), street or city.
So statistics about most used places in your software are not possible.

I personally would prefer to have only City as PLAC value and more details in _LOC/SPLAC record (i.e. State, Country) and <ADDRESS_STRUCTURE> (or SADDR record - i.e. name of hospital or cemetery).

Dirk

@Norwegian-Sardines

I don’t use PLAC.FORM; it has no value to me or anyone that uses my software!

Place Statistics start with the highest order (country), all levels should be consistent as you go down. I see no problem!

If I need to know the number of people that had an event in a city that city is described once and reused!

@Norwegian-Sardines

Dirk - I thought you use the PLAC value as a replacement for the address structure. But you are not interested in zip code (as in ADDR.POST)

I personally think the Address_Structure as defined in the GEDCOM v5.5.1 specification has no value in a Genealogical Database.

v5.5.1 - The address structure should be formed as it would appear on a mailing label using the ADDR and the CONT lines to form the address structure. The ADDR and CONT lines are required for any address.

This structure (as described above) is best used to send letters and mail to the individual. First, a large percentage of individuals in most genealogical databases are dead, so they will not be getting mail from me! Second, using the Address_Structure for anything other than a person's residence does not make sense: why send a letter to their church, hospital, or the place they lived in a census 50 years ago? Third, I am not a fan of having address information for the living in a GEDCOM that can be shared to and used by others. Privacy for the living is important and we have no way to prevent a PLAC from being transmitted with an address.

However, I can understand if you want to record an address for an individual's residence. I think it should be recorded in the RESI as a line of text, maybe as part of a SNOTE, which has Privacy coverage and can be prevented from being transmitted.

NOTE: ADDR.POST will probably be deprecated in the future! The v5.5.1 GEDCOM specification says:

v5.5.1 - The additional subordinate address tags such as STAE and CTRY are provided to be used by systems that have structured their addresses for indexing and sorting. For backward compatibility these lines are not to be used in lieu of the required ADDR and CONT line structure.

I am interested in the zip code if it defines a place uniquely where using just the city or region does not.

@albertemmerich
Collaborator

If you do not use PLAC.FORM, how do you know what place this PLAC:
2 PLAC Kansas, USA
describes? A city? A state?
Without link to any description uniquely defining the place, the PLAC.FORM is one of the most important features in GEDCOM up to 7.0. In 7.0 we started to have PLAC.EXID which for the first time was a possibility to identify a place without the hierarchical structure in PLAC payload.
Now I can do:

2 PLAC Kansas, USA
3 EXID KANSASEM33MU
4 TYPE http://gov.genealogy.net/

So it is the city in Clark County in the state of Arkansas.
But now, how do you find out which other cities in Arkansas are in your data?

With the hierarchical structure of _LOC records as defined by GEDCOM-L group no problem:
Pick all records which have Arkansas record as next level record, pick again all next sublevel until you find the elements on level "city". If you want to know all lower levels, too: No problem, look for sublevels until there is no element found any more.
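
The traversal described above can be sketched as follows. This is only an illustration under stated assumptions: the record ids, the `parent_of` and `type_of` dictionaries, and the helper names are hypothetical, and a real _LOC implementation would also have to deal with dated and multiple parent links.

```python
from collections import defaultdict, deque


def descendants(parent_of, root):
    """Return every record id that sits below `root`, at any depth."""
    children = defaultdict(list)
    for child, parent in parent_of.items():
        children[parent].append(child)

    found = []
    seen = {root}  # guards against accidental cycles in the data
    queue = deque(children[root])
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        found.append(current)
        queue.extend(children[current])
    return found


def of_type(record_ids, type_of, wanted):
    """Keep only the records whose type equals `wanted`."""
    return [rid for rid in record_ids if type_of.get(rid) == wanted]


# Hypothetical location records: child id -> id of the next higher level.
parent_of = {
    "@L2@": "@L1@",  # Clark County -> Arkansas
    "@L3@": "@L2@",  # Kansas -> Clark County
    "@L4@": "@L1@",  # Pulaski County -> Arkansas
    "@L5@": "@L4@",  # Little Rock -> Pulaski County
}
type_of = {
    "@L1@": "state",
    "@L2@": "county",
    "@L3@": "city",
    "@L4@": "county",
    "@L5@": "city",
}

below_arkansas = descendants(parent_of, "@L1@")
cities_in_arkansas = of_type(below_arkansas, type_of, "city")
```

The same `descendants` call without the type filter gives "all lower levels" for a record.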

@Norwegian-Sardines

If you do not use PLAC.FORM, how do you know what place this PLAC:
2 PLAC Kansas, USA

I don't really care what type of place this is! It is the place where an event happened. If I take the text "Kansas, USA" and put it into Google Maps it will find the State and it will show on a map. If you transmitted "Chicago, USA" or "Chicago, Illinois" it would find it too!

If I transmitted "Kansas City, USA" it would need more clarification, Missouri or Kansas? So you should be transmitting:
PLAC Kansas City, Kansas, USA
or
PLAC Kansas City, Missouri, USA

This is why it is imperative that a unique name be transmitted that can be found on a map! How do you know what/where is with the PLAC.FORM:
2 PLAC Lincoln, USA
3 FORM City, Country

@albertemmerich
Collaborator

Zip code is one of the safest ways to identify a place where the same name is used for several places. So it is one of the easiest ways for a safe description of places. I have seen a lot of GEDCOM files in the wild using the zip code within the PLAC payload of 5.5 / 5.5.1, like
2 PLAC Hanstedt, D-27793
or
2 PLAC Hanstedt, D-21271
or
2 PLAC Hanstedt, D-29525
All of them are today in the state of Lower Saxony (Germany), the first in the county of Oldenburg, second in county of Harburg, last in county of Uelzen. As the administrative organisation is modified more often than the system of zip codes, it is a good way to identify places. And used in the wild despite violating the GEDCOM standard, as the zip code is not an administrative level in the place hierarchy. But at import I can identify those places automatically by searching for the zip codes in the GOV system on genealogy.net.
GEDCOM-L has put the zip code as one of the subtags in their location records, using tag POST. I use it very often. I would like to see it in a future place record; otherwise I would have to implement an extension tag _POST in the record.
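
As an illustration of picking such a zip code out of a PLAC payload at import, here is a rough sketch. It assumes codes written like the ones above (letters, a hyphen, then five digits); the pattern and the helper name `extract_postal_code` are assumptions for this example only, and the lookup in an external database such as GOV is not shown.

```python
import re

# Assumption: codes look like "D-27793" (one to three capital letters,
# a hyphen, five digits). Other countries would need other patterns.
POSTAL_CODE = re.compile(r"\b([A-Z]{1,3})-(\d{5})\b")


def extract_postal_code(plac_value):
    """Return (place text without the code, code), or (text, None) if no code."""
    match = POSTAL_CODE.search(plac_value)
    if match is None:
        return plac_value.strip(), None
    remaining = plac_value[:match.start()] + plac_value[match.end():]
    # Drop the part that became empty and tidy the whitespace of the others.
    parts = [part.strip() for part in remaining.split(",")]
    cleaned = ", ".join(part for part in parts if part)
    return cleaned, match.group(0)


name, code = extract_postal_code("Hanstedt, D-27793")
plain_name, no_code = extract_postal_code("Kansas, USA")
```

The extracted code could then be stored in a POST (or _POST) substructure instead of staying inside the jurisdiction list.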

@Norwegian-Sardines

Norwegian-Sardines commented Sep 11, 2024

Oh, and by the way. For my clients and family in Norway I would actually transmit or display.

2 PLAC Kansas State, USA

Just Like I would enter:
2 PLAC Rogaland Fylke, Norway

or

2 PLAC Chicago, Cook County, Illinois State, USA

or

2 PLAC New York City, New York State, USA

@Norwegian-Sardines

Norwegian-Sardines commented Sep 11, 2024

If I used a well managed, maintained and curated offsite database then these values would have a more disciplined naming structure and therefore also not need PLAC.FORM, because the offsite database would be the expert. Most individuals in the wild (not you) don't care about knowing about or using PLAC.FORM; it only matters when you are building a repository like FamilySearch or geonames. Most real users and software implementers would use these resources to create a "correct" place hierarchy and not be creating the resource on their own.

@Norwegian-Sardines

Zip code is one of the safest ways to identify a place where the same name is used for several places. So it is one of the easiest ways for a safe description of places.

I agree to a point. In my town of 47000 people we have 9 different zip codes, which one do I use?

@Norwegian-Sardines

Norwegian-Sardines commented Sep 11, 2024

Personally, in your case where each town has only one zip code, you could enter the following:

2 PLAC Hanstedt D-27793, Oldenburg County, Lower Saxony, Germany
or
2 PLAC Hanstedt D-21271, Harburg County, Lower Saxony, Germany
or
2 PLAC Hanstedt D-29525, Uelzen County, Lower Saxony, Germany

But really, in my example the county without the zip code still makes the place unique.
So you could also have:
2 PLAC Hanstedt, Oldenburg County, Lower Saxony, Germany

NOTE: I removed the comma between the town and the zip code because they go together to make a unique place and should not violate the GEDCOM Standard for creating a non administrative level.

@Dirk-Ahnenblatt

Dirk-Ahnenblatt commented Sep 13, 2024

Now I have read the complete discussion here and would agree to all of @mother10 's proposals.

As I understand PLAC is going down just to the city and gets an SPLAC record for more detailed information. ADDR is in my view not just a postal address, but a more detailed description of the place, building or location. This will get BLDNG records to avoid redundancies and add more details like coordinates.
I personally would prefer the term "location" (instead of building). Otherwise we would see the term "Building" in all future software dialogs.

What I also like is the idea of a new tag CITYNAME. I would wish that it is not necessary, because PLAC has already only the value which is presented to the user (like CITYNAME purpose). But there are so many GEDCOMs around which have these comma-formatted PLAC values.

About PLAC, FORM and commas:
FORM was introduced with GEDCOM 5.3 to allow a hierarchical structure of the place name in order to specify a place more precisely and distinguish it from places with the same name.
But this seems to be only a work-around because a lack of better options (such as SPLAC records).

In GEDCOM 5.3 there are several samples ...
FORM city, county, state, country
... and ...
FORM City, County, State, Country

No pre-defined jurisdictions? Should all terms be written in lower case or does the spelling not matter? Do these terms always have to be in English? What about the use of additional jurisdictions? Each software can invent "own" jurisdiction terms (like "building name" or "street with number"), which in the end only exist in this software. And there is software which doesn't use FORM tags, even if users enter these comma-separated PLAC jurisdictions (on their own risk).

This would be no problem if software handles this comma-thing internally and presents users only the name of the city. But to keep the right hierarchy levels of a PLAC value is in most cases the duty of the user.

I never came across software that is checking numbers of commas in all PLAC values and gives warnings. What about merging/adding/importing GEDCOM files? Will FORM values and PLAC values/hierarchies (number of commas) be changed? It should be checked - but in reality it is not.
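
A rough sketch of what such a check could look like, assuming a single HEAD.PLAC.FORM applies to the whole file and ignoring any FORM given on an individual PLAC. The function name `check_place_levels` and the sample lines are hypothetical; this is not taken from any existing software.

```python
def check_place_levels(gedcom_lines):
    """Return (line number, payload) for PLAC values that do not fit HEAD.PLAC.FORM."""
    expected = None
    in_head = False
    in_head_plac = False
    problems = []
    for number, raw in enumerate(gedcom_lines, start=1):
        fields = raw.strip().split(" ", 2)
        if len(fields) < 2:
            continue
        level, tag = fields[0], fields[1]
        value = fields[2] if len(fields) > 2 else ""
        if level == "0":
            in_head = tag == "HEAD"
            in_head_plac = False
        elif in_head and level == "1":
            in_head_plac = tag == "PLAC"
        elif in_head_plac and level == "2" and tag == "FORM":
            # Number of jurisdictions every PLAC payload is expected to have.
            expected = len(value.split(","))
        elif tag == "PLAC" and value and expected is not None:
            if len(value.split(",")) != expected:
                problems.append((number, value))
    return problems


sample = [
    "0 HEAD",
    "1 PLAC",
    "2 FORM City, County, State, Country",
    "0 @I1@ INDI",
    "1 BIRT",
    "2 PLAC Chicago, Cook County, Illinois, USA",
    "1 DEAT",
    "2 PLAC Kansas, USA",
    "0 TRLR",
]
warnings = check_place_levels(sample)
```

Here only the second PLAC is reported, because it has two parts where the FORM announces four.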

Every software has built its own strategy how to deal with these comma-separated values (including "empty" commas) when generating outputs.

In my eyes, with the introduction of SPLAC, the PLAC/FORM comma-separated hierarchy structure should be declared dead. There should be no enforcement in the GEDCOM documentation to still do it. It should be replaced by new SPLAC + BLDNG records (final name of BLDNG record can be discussed - I would suggest LOC).

Dirk

@mother10
Author

Hi, it took a while. But I finally managed to get my draft pull request up there: #552.
It is an attempt to deal with the problems mentioned in the Zoom session.

It has an extra Examples page to see the proposal "working".

After I had worked out the PERIODS a bit, I also wanted to see how things might look with [GEDCOM-L's _LOC extension] inside.
It's a draft, so just a proposal.
It's something to look at and discuss.

The problem of a town in Serbia "moving" between different parents was really huge, so I made a scheme first before I tried to put that into SPLACs.

Maybe start with the example page before diving into the specifications.

Have fun!
Tineke

@Norwegian-Sardines

Norwegian-Sardines commented Oct 1, 2024

Meanwhile I've put together a design document for an alternative SPLAC (Shared_Place_Record) that does not use GEDCOM-L type hierarchy due to the problems pointed out here regarding:

the supposed "advantage" of the hierarchical _LOC structure is the ability to add dates to the relationships. Thus you can represent situations such as Gdansk being variously in Poland, Germany, Russia, Prussia, etc.

Which would require every GEDCOM to include the same historic geographic database. This historic geographic database ought to be part of a genealogy application - not included in every genealogy file.

Unfortunately I'm not fully acquainted with GITHUB PR, check out and the like so I've created what I normally do when I propose an update to a document or application:
Proposal - Place Record.pdf

This is my first blush at the topic, so I may revise it as I get input from others, or see flaws in the document.
Thanks

@mother10
Author

mother10 commented Oct 1, 2024

Maybe I should clarify a bit about that huge Serbia-Subotica example in my PR:

That I entered all that info, does NOT mean a user should dig up all that information.
If a user only has 1 event say in 1917, all he needs for his SPLAC is the information about the place and its parent, FOR THAT YEAR. Not that whole scheme.
If later he discovers an event 2 years after that in 1919, he will need to only add the info for that year, for that place and its parent.
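
The idea of storing only the periods that are actually needed can be sketched like this. The class `ParentPeriod`, the function `parent_in_year` and the placeholder parent names are hypothetical; they are not the structures of the draft pull request and carry no real historical data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParentPeriod:
    parent: str                # placeholder for a pointer to the parent place
    start_year: Optional[int]  # None means open at the start
    end_year: Optional[int]    # None means open at the end


def parent_in_year(periods, year):
    """Return the parent valid in `year`, or None if that year is not covered."""
    for period in periods:
        starts_ok = period.start_year is None or period.start_year <= year
        ends_ok = period.end_year is None or year <= period.end_year
        if starts_ok and ends_ok:
            return period.parent
    return None


# The user has one event in 1917, so only that year is described.
periods = [ParentPeriod("Parent A", 1917, 1917)]
known_1917 = parent_in_year(periods, 1917)
unknown_1919 = parent_in_year(periods, 1919)

# Later an event in 1919 turns up; only that period is added.
periods.append(ParentPeriod("Parent B", 1919, 1919))
known_1919 = parent_in_year(periods, 1919)
```

Years without an entry simply stay unknown until an event makes them relevant.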

That in my examples I wrote out a lot coming from that scheme, is just to give an idea how it will look if there are events during that whole period.

And the only things about a place and its parent(s) that will be in the GEDCOM is statistical info about the location, the parent, its name and such.
As those could have been of great influence on someones life.

The GEDCOM will not contain an historical treatise about how that place was conquered defended and all that kind of thing. So certainly not the whole historybook about that place. Just the basic facts needed to define it in that time period. To help clarify why and how someone in our GEDCOM did and lived like he did in that time.
Nothing more.

That's what I think.

About not being able to put up a PR:
I can agree it was not easy and very frustrating doing it for the first time.
But as you can see it's possible even for someone not being a collaborator and having no more than a simple GitHub account.

But I would really really have liked a simple step by step of how to do a PR. There is lots of info in the GitHub docs, I read it many times, but it still was very difficult, and I got lots of complaints from the GitHub system, refusing my uploads.
Fortunately I got some advice from Luther which finally helped me to figure it out. (Thanks for that!!)

@albertemmerich
Collaborator

@Norwegian-Sardines: Thank you for providing your proposal.

For me the hierarchical relations in between the SPLAC records are essential, and they are missing in the proposal. By this the proposal is requiring to put in the hierarchy by list into every record by SPLAC.PLAC.
Let us take the county of nowadays Heidekreis in Lower Saxony, Germany. Until 2011 it was named "Landkreis Soltau-Fallingbostel". And in 1977 it was established by merging the former counties of Soltau and Fallingbostel.
So we have in the SPLAC record in your version
2 PLAC Tetendorf, Heidekreis, Niedersachsen, Deutschland
3 DATE FROM 2011
2 PLAC Tetendorf, Soltau-Fallingbostel, Niedersachsen, Deutschland
3 DATE FROM 1977 TO 2011
and so on. However for me this is to be put to the record of county Heidekreis, and the SPLAC record for Tetendorf pointing to the SPLAC record of Heidekreis. GEDCOM-L simply allows
2 NAME Tetendorf
in the _LOC record. The identification is NOT made by any hierarchical list.

What about
2 PLAC Soltau, Niedersachsen, Deutschland
?? Here it is not clear whether it is the city of Soltau (we have only one city "Soltau" in Lower Saxony/Niedersachsen) or the county in Lower Saxony/Niedersachsen. What is missing? The type of the location: TYPE. GEDCOM-L allows to classify the records by its type. And this includes church structures, so you can document in which parish your village is, or which churches are operating in a city.

Next point: In your proposal the official name of a location is missing. In many cases it is not in the source of the event (as there we often find additional data to identify the location, but not the hierarchical list you are using in your SPLAC record). The name in the example is "Tetendorf" (given in official registries of places in Lower Saxony), in the genealogical source it may be given as "Tetendorf bei Soltau" (which you put in the calling record, same in my application), but nowhere you have Tetendorf, Heidekreis, Niedersachsen, Deutschland or any thing like that in the sources. That is an artificial construction which is not needed anywhere.

Using the hierarchical structure of GEDCOM-L location records, my application can find all people with any event (or other search criteria as name, occupation, year of birth) in the county of Heidekreis. The application does this by checking the record county Heidekreis, and all "sub"records linked to Heidekreis in any generation. I will implement an extension to build the hierarchical structure of location records as long as GEDCOM standard does not offer it itself to make sure this functionality is covered by GEDCOM transmission.

Your argument, that the GEDCOM-L solution implies that all applications / genealogies must implement the same data of places, does not work for me. I have the external link to GOV (gov.genealogy.net), and when importing an additional GEDCOM file my application automatically merges location records with same ID in GOV. So it is open to every researcher, which data he puts in the record. GOV offers a webservice, so my application can optionally download the data from GOV. As this is a very time consuming process, it is much faster to transfer the downloaded information by GEDCOM than to download the same information again and again. Especially looking for all people in a county would not run if the hierarchical structure within the county had to be downloaded first in this call.
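
A simplified sketch of merging location records by a shared external id at import. It assumes each record is a plain dictionary keyed by its external id and that already known fields win over imported ones; the function name, the field names and the second id are hypothetical, and this is not a description of how any particular application does it.

```python
def merge_by_external_id(existing, incoming):
    """Merge location records that share an external id; keep the others apart."""
    merged = {ext_id: dict(record) for ext_id, record in existing.items()}
    for ext_id, record in incoming.items():
        target = merged.setdefault(ext_id, {})
        for field, value in record.items():
            # Keep what is already known; only fill in missing fields.
            target.setdefault(field, value)
    return merged


# "KANSASEM33MU" is the id quoted earlier in this thread; the rest is made up.
existing = {"KANSASEM33MU": {"name": "Kansas"}}
incoming = {
    "KANSASEM33MU": {"name": "Kansas", "type": "city"},
    "HYPOTHETICAL_ID": {"name": "Example Town"},
}
merged = merge_by_external_id(existing, incoming)
```

Records without any external id would still need another matching strategy, which this sketch does not cover.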

@Norwegian-Sardines

Norwegian-Sardines commented Oct 3, 2024

I waited a while to respond to Albert in case a nonpartisan question was asked or observation was made.

Sorry for the lengthy reply!

Albert: For me the hierarchical relations in between the SPLAC records are essential, and they are missing in the proposal. By this the proposal is requiring to put in the hierarchy by list into every record by SPLAC.PLAC

Reply: First, why is it “essential” for the transmitting application to provide a hierarchy of superior entities as a separate set of record instances in the GEDCOM? Second, when an application user identifies a place in their genealogy, they most likely will not know or have access to a correct list of entities in the place hierarchy. Their software may or may not include a mechanism that provides a hierarchy of superior entities. If they have a list, that list may be inaccurate, unreliable for the time of the event, or not match the list provided by GOV.

It is not possible for all GEDCOMs to provide a complete list of superior place entities and while many applications try to create the illusion of completeness, they are not 100% complete. GOV itself has missing levels in many of its place entities and has many missing historical and modern-day combinations.

In the proposal a 100% complete “list” is not required at all, it does not have to have all levels of hierarchy. The “list” only needs to contain enough information to uniquely define the “Place of the Event” so that it can be found on a map or in person.

Albert: Let us take the county of nowadays Heidekreis in Lower Saxony, Germany. Until 2011 it was named "Landkreis Soltau-Fallingbostel". And in 1977 it was established by merging the former counties of Soltau and Fallingbostel.

Reply: An application user would see a Source Artifact stating that an event occurred in Tetendorf, Soltau, Deutschland and enter it in the .PLAC tag. At some later date they could create an SPLAC record that has a SPLAC.PLAC entry of Tetendorf, Heidekreis, Lower Saxony, Germany. The application user probably does not care about the intermediate names or understand the dynamics of the history of German States and Districts, just the current name. If they were interacting with German speakers, they may want to include a German spelling of the place, but it is not required. The transmission would contain:

1 EVEN
2 PLAC Tetendorf bei Soltau, Deutschland
3 SPLAC @P1@

0 @P1@ SPLAC
1 PLAC Tetendorf, Heidekreis, Lower Saxony, Germany
2 LANG en
1 PLAC Tetendorf, Heidekreis, Niedersachsen, Deutschland
2 LANG de

Albert: What about 2 PLAC Soltau, Niedersachsen, Deutschland?? Here it is not clear whether it is the city of Soltau (we have only one city "Soltau" in Lower Saxony/Niedersachsen) or the county in Lower Saxony/Niedersachsen. What is missing? The type of the location: TYPE. GEDCOM-L allows to classify the records by its type. And this includes church structures, so you can document in which parish your village is, or which churches are operating in a city.

Reply: If Soltau is unique within Niedersachsen then why do I as the transmitter care if the entity is a city, farm, county? I can enter Soltau, Niedersachsen, Deutschland into any search engine and it will return the German town that is Soltau! For that matter it looks like Soltau is unique within Germany as well. A Germany novice may only know a minimal amount about the various entities in Germany and can’t know that Soltau is a city, farm or other and can’t be required to enter the complete hierarchy, whether that hierarchy is a list or a set of hierarchical records!

I’ve never seen a major application have an input field for the PLAC.FORM information. I’ve also asked around to people who have used other software that I have not used (they have listed several dozen). None of them indicate the use of or possible entry of the PLAC.FORM payload. Additionally, the FORM payload has no defined set of values, or languages to use. A term used in one country means something different in another country. The use of a TYPE value is fine for the GOV database and is helpful for a GOV subscriber. It is useless for individuals that don’t subscribe to GOV, speak languages not found in GOV, or have entities they want to include in their transmission that are not described in the GOV database. This concept should be contained as part of GOV (as a source of information) not in a GEDCOM transmission.

Albert: Next point: In your proposal the official name of a location is missing. In many cases it is not in the source of the event (as there we often find additional data to identify the location, but not the hierarchical list you are using in your SPLAC record). The name in the example is "Tetendorf" (given in official registries of places in Lower Saxony), in the genealogical source it may be given as "Tetendorf bei Soltau" (which you put in the calling record, same in my application), but nowhere you have Tetendorf, Heidekreis, Niedersachsen, Deutschland or any thing like that in the sources. That is an artificial construction which is not needed anywhere.

Reply: The GEDCOM should not be responsible for transmitting the official name unless that is either part of the “Source Artifact”, or the creator of the transmission enters it in the SPLAC when they create the record. The transmission does not require (nor do most applications) or care about the official name of the entity. Most GEDCOMs I have received do not spell out “United States of America”, “Kingdom of Norway”, “Kongeriket Norge” (in Bokmål), “Kongeriket Noreg” (in Nynorsk) or “Bundesrepublik Deutschland”. Unless the application subscribes to a geo-locating and describing database, the GEDCOM data sent in the transmission is completely the responsibility of the data entry person.

Albert: Using the hierarchical structure of GEDCOM-L location records, my application can find all people with any event (or other search criteria as name, occupation, year of birth) in the county of Heidekreis. The application does this by checking the record county Heidekreis, and all "sub"records linked to Heidekreis in any generation. I will implement an extension to build the hierarchical structure of location records as long as GEDCOM standard does not offer it itself to make sure this functionality is covered by GEDCOM transmission.

Reply: In your example this is true if and only if Heidekreis is unique within the database. If the city name was “Paris” we could have multiple record instances within a single database. If the application subscribes to the GOV database a GOV-ID may be present, but it is very presumptuous to assume that any application will automatically subscribe to this database. Rather, a software application can subscribe to any number of providers, create their own database or have no geolocation database. A second problem is that if an application does subscribe to the GOV database, does that application bring in all of the data found in the database for the entity, or just what the application user needs or cares about? For example, they may only care about the entity’s current name and locator. The rest could be in their GEDCOM, creating unnecessary bloat!

The information about the hierarchy of entities in the GOV database is great as a reference for entities they support, but GOV does not support all place entities in the world where an event may occur. Keep this information and the hierarchy with GOV, don’t carry it within the GEDCOM transmission.

Albert: Your argument, that the GEDCOM-L solution implies that all applications / genealogies must implement the same data of places, does not work for me. I have the external link to GOV (gov.genealogy.net), and when importing an additional GEDCOM file my application automatically merges location records with same ID in GOV.

Reply: The proposal has no issue with linking to the GOV database, this linking is provided by the proposal. We need to separate what the GOV Place Directory (“GOV”) does from the function of the GEDCOM transmission! GOV as a tool is a fine place (although incomplete) to research and document places in the world. The GOV-ID can be used as a source of information about a place entity and referencing GOV adds some value to a GEDCOM. However, the structure that GEDCOM-L suggests for inclusion in Standard GEDCOM, which I can only assume mimics the structure of the GOV database, increases the size of a transmission by its need to create record instances for all entity levels superior to the targeted place. These additional record instances have limited value in the transmission.

The example above, “Tetendorf, Heidekreis, Lower Saxony, Germany” requires four SPLAC records, one for each of the levels in the hierarchy to be created and transmitted. The three additional record instances (for Heidekreis, Lower Saxony, and Germany) may never be referenced by any Fact/Event PLAC tags within the transmission and therefore are overkill.

In my proposal only one SPLAC instance is required. For application databases that use the GEDCOM record structure in some form as their schema, the overhead of multiple SPLAC levels would be problematic, especially if a place entity could have multiple superior place entities based on time or language. Language comes into play when an entity was part of a French-speaking superior entity and later became part of a German-speaking superior entity.

A far bigger issue is when the bottom level is a common entity name like “Paris” or “Lincoln”, where locating the correct Lincoln or Paris requires at least one (possibly more) higher-level entity instances to be called to determine the correct target entity (“Paris”). Before the SPLAC can be connected to the Fact/Event, all possible Paris or Lincoln combinations in the GEDCOM must be reviewed to determine the correct one (and if none is found, a new SPLAC hierarchy must be added). For example, the user knows that the person was born in Paris, Ohio. Under the multi-level SPLAC design the application will have to look several levels up to determine whether it has Paris, Ohio (in the correct township and county), or Paris, Michigan, or any one of more than a dozen cities named Paris in the USA. There are more than 40 cities named Lincoln in the USA and more in the world!
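To make the lookup problem concrete, here is a minimal Python sketch of how an application might narrow down same-named places when each SPLAC record carries its full comma-separated PLAC payload. The function names, record ids, and the two shortened Ohio and Michigan payloads are assumptions for illustration only; nothing here is part of either proposal.

```python
from typing import Iterable


def parse_place(plac: str) -> list[str]:
    """Split a comma-separated PLAC payload into jurisdictions, smallest first."""
    return [part.strip() for part in plac.split(",")]


def find_candidates(records: dict[str, str], name: str,
                    known: Iterable[str] = ()) -> list[str]:
    """Return ids of records whose lowest jurisdiction is `name` and whose
    higher jurisdictions include every entry in `known`."""
    wanted = set(known)
    matches = []
    for rec_id, plac in records.items():
        parts = parse_place(plac)
        if parts[0] == name and wanted <= set(parts[1:]):
            matches.append(rec_id)
    return matches


# Hypothetical SPLAC records (id -> PLAC payload), illustrative values only.
records = {
    "@P1@": "Paris, Lamar County, Texas, USA",
    "@P2@": "Paris, Ohio, USA",
    "@P3@": "Paris, Michigan, USA",
}

print(find_candidates(records, "Paris"))            # ['@P1@', '@P2@', '@P3@']
print(find_candidates(records, "Paris", ["Ohio"]))  # ['@P2@']
```

With a flat payload the higher jurisdictions are available in the same record, so the user's partial knowledge ("Ohio") filters the candidates without walking any links.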

Additionally, using the GEDCOM-L record (again, we are not talking about the GOV database) where an entity has multiple superior entities based on date, as in your “Tetendorf” example, we could take different paths to find the correct superior entity (Soltau, Landkreis Soltau-Fallingbostel, or Heidekreis). The user of the application may not have or know the date when the event occurred and therefore could either take the wrong path to the next level or not select any next level at all! If the user creates an SPLAC record themselves without knowing all the levels, they could just enter Tetendorf, Germany and be done with the data entry at that time!

@albertemmerich
Collaborator

albertemmerich commented Oct 3, 2024

Soltau is not unique in Lower Saxony. It was a county, and it is a city. If the data transferred by GEDCOM has neither the type of the place nor the next-level object, a user not familiar with the situation will treat the place "Soltau" as a city. And if that is wrong, because the original source cited the county, then the user will not find any sources of the event in any archive of the city of Soltau, if the event took place in another place within the county of Soltau and not in the city of Soltau.
I very often see this problem when English-speaking people ask for data about Hanover. They have a source saying the person came from Hanover, and they do not succeed in finding any record of the person while looking in the city of Hanover. They won't, as in many cases the source cited the kingdom of Hanover (which had the same king as Great Britain). Therefore they have to look for the person in the former kingdom of Hanover, which covers a similar area to today's state of Lower Saxony.
This problem is often seen if the type is not given correctly. In Germany you have thousands of examples where the same name can refer to places at two, and in many cases even more, levels. Oldenburg is a city (we have a lot of cities called Oldenburg in Germany), is a county, and was a state. So a source saying "he was coming from Oldenburg" needs very broad research to identify the correct place. Once you have done this, it should be documented very well.

Next point: when putting in a place, I only need to identify the next-level place object and link to it. For all places linked to the higher-level place object, I only have to enter its data once, including its next-level object (or several, if there were more depending on date periods). Through the link, the data structure automatically transfers this information to all lower-level objects. That is much easier than looking up the complete hierarchy again and again for every lowest-level place. So within a place object, no other information about any higher-level object is needed, only the link to the next level. This reduces the time needed to build the place records, and it reduces the size of the data in the GEDCOM file, because the information about which state a county is located in is not written for every place in that county, but only for the county itself. If you want, you can create the list of hierarchies for a place from the hierarchical database and put it in your reports. And you can do that with the hierarchy exactly matching the date of the event. That is the only use case where you really need this hierarchical list.
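As a rough illustration of the linked design described here, the following Python sketch stores only a dated link to the next-level object in each location and derives the full hierarchy for a given year by following those links. The class names, record ids, and especially the year bounds are placeholders invented for the example, not data taken from GEDCOM-L or GOV.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ParentLink:
    parent_id: str
    start: Optional[int] = None  # first valid year, None = open-ended
    end: Optional[int] = None    # last valid year, None = open-ended


@dataclass
class Location:
    name: str
    parents: list[ParentLink] = field(default_factory=list)


def hierarchy(locations: dict[str, Location], loc_id: str, year: int) -> list[str]:
    """Follow next-level links valid in `year`, returning names smallest first."""
    chain, seen = [], set()
    current: Optional[str] = loc_id
    while current is not None and current not in seen:  # guard against cycles
        seen.add(current)
        loc = locations[current]
        chain.append(loc.name)
        current = next(
            (link.parent_id for link in loc.parents
             if (link.start is None or link.start <= year)
             and (link.end is None or year <= link.end)),
            None,
        )
    return chain


# Placeholder data: the year bounds below are illustrative assumptions only.
locations = {
    "L1": Location("Tetendorf", [ParentLink("L2", end=2010), ParentLink("L3", start=2011)]),
    "L2": Location("Landkreis Soltau-Fallingbostel", [ParentLink("L4")]),
    "L3": Location("Heidekreis", [ParentLink("L4")]),
    "L4": Location("Lower Saxony", [ParentLink("L5")]),
    "L5": Location("Germany"),
}

print(", ".join(hierarchy(locations, "L1", 2020)))
```

Note that the lookup needs a year to choose between the two county links; without one, the application has to guess or stop at the lowest level, which is the concern raised earlier in the thread.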

On all the other points, we will not reach agreement. I need the data, @Norwegian-Sardines does not. So I need to export them, he does not. As long as the GEDCOM standard does not offer the possibility to export the data I need, I will stay with the extensions we have agreed on in the GEDCOM-L list.

The question is, whether other applications use the link to next level objects, too. Then this link should be within GEDCOM standard so these applications can share the data.

@Norwegian-Sardines

Norwegian-Sardines commented Oct 3, 2024

Soltau is not unique in Lower Saxony. It was a county, and it is a city. If the data transferred by GEDCOM has neither the type of the place nor the next-level object, a user not familiar with the situation will treat the place "Soltau" as a city.

This is 100% true, and if I found a source (like a bible) that said the person was born in Soltau, Germany, I would not know whether they were born in the city or the county. Under the GEDCOM-L design I could only create two SPLAC record instances, 1) Soltau and 2) Germany, because I don't have access to anything that might help. If I thought to go to Wikipedia I might find that it says:

is a mid-sized town in the Lüneburg Heath in the district of Heidekreis, in Lower Saxony, Germany

So do I create three additional SPLAC records (Lüneburg Heath, Heidekreis, and Lower Saxony) and then I would not know what TYPE to use for "Lüneburg Heath" and "Lower Saxony". By then I'm getting bored with data entry and saying forget it "Soltau, Germany it is"!!!

As you have noted, people not familiar with the setup of divisions in other countries (and possibly their own) don't invest time in researching the place. Most people living in the Chicago, Illinois, USA area don't know that "Cook County" (the county that contains Chicago and many of its suburbs) has more than 30 townships (a division between county and city), and they never input that division. Some cities are found in multiple counties; which one is the next higher level in GEDCOM-L?

I don't understand your second point! If I have 8 Paris City SPLAC instances in your GEDCOM-L hierarchy and only know that it is in the State of Ohio, I don't know the potential two levels (county and township) between the State SPLAC instance and the City SPLAC instance.

@Dirk-Ahnenblatt

@Norwegian-Sardines :
Some comments on your proposal.

  1. I had hoped that the comma-separated location entities would be replaced by SPLAC records. Your answers after that put them into perspective again (from your side, no claim to completeness).
    In your proposal, the comma-separated elements make it seem as if nothing has changed compared to older GEDCOM versions, so I would appreciate it if examples such as...
PLAC Paris/Texas (USA)
PLAC Paris; Texas/USA
PLAC Texas (town Paris)

... or similar were also included.
How is a beginner to know that he has to enter comma-separated elements in ascending order in his software?

  2. It should be clearer which of the place texts would then be used for output. Especially with extensive list output, a place does not have to be repeated each time with all its place entities.

Based on your example...

1 BIRT
2 PLAC Paris
3 SPLAC @P1@

... I think it should be the PLAC information (here: Paris). A corresponding statement/recommendation in the text would provide clarity (or have I overlooked it?).

  3. In example 4, it says...
0 @P5@ SPLAC
1 PLAC John Doe Grave, Highland Cemetery, Washington Township, Ohio, USA

I would not see the new SPLAC structure as a replacement for ADDR and would rather enter “Highland Cemetery, John Doe Grave” as ADDR. For me, ADDR are therefore not always mandatory postal addresses.

  4. In example 4, the ABBR tag is also used, intended for an abbreviation. However, the text is much too long for an abbreviation. I use ABBR in the form that a place like Hamburg (PLAC Hamburg) can then have its own abbreviation (e.g. ABBR Hmbg), which is used in extensive lists/books to save space. The software can then generate an additional glossary page in which, among other things, “Hmbg = Hamburg” can be found as an explanation. Or have I misinterpreted ABBR here?

  5. I don't particularly like the combination of PLAC and DATE to document historical name changes of the place (example “Oslo”). What should my software do with it? What can I do with the information as a user? “Oslo Fylke” would confuse me as a user. I don't speak Norwegian. Is this a two-part place name or a street name? Is this PLAC/DATE listing correct at all and where does it come from (don't you also have to insert a SOUR?)?
    I could therefore do without DATE information.

  6. Just a suggestion: the PLAC record contains the place name (PLAC), a tag for an international country code (e.g. ISO 3166-1), coordinates (MAP.LONG/LATI), note (NOTE) and source (SOUR), as well as an identifier for GOV and similar services (e.g. with EXID). Isn't a location sufficiently described by its coordinates? As an alternative to GOV, I quickly found this service:
    Not perfect – but there are certainly others.

Dirk

@Norwegian-Sardines

Norwegian-Sardines commented Oct 4, 2024

Dirk: I had hoped that the comma-separated location entities would be replaced by SPLAC records. Your answers after that put them into perspective again (from your side, no claim to completeness).
In your proposal, the comma-separated elements make it seem as if nothing has changed compared to older GEDCOM versions, so I would appreciate it if examples such as...
Reply: To be honest, I had hoped that a hierarchical system of SPLAC records could have been implemented, but I saw too many problems with this implementation. I’m hoping that my interaction with Albert sheds light on why I think a record-based hierarchy is not possible.


Dirk: How is a beginner to know that he has to enter comma-separated elements in ascending order in his software?
Reply: All of the software I have dealt with to this point tells its users how to enter Place information in its help text. These applications already understand how to help their user base enter the data. Many users never even think or care about GEDCOM, but if they do, the software currently creates a GEDCOM with comma-separated places from their input. We are creating a protocol for transmission of data; the application can ask for data in any form it wants, and it is the application’s job to understand GEDCOM and create the output as The Standard dictates.


Dirk: It should be clearer which of the place texts would then be used for output. Especially with extensive list output, a place does not have to be repeated each time with all its place entities.
Reply: My hope is that applications would use the ABBR (abbreviation) payload from the primary PLAC payload as their display value. A SPLAC record has a SPLAC.PLAC payload and a SPLAC.PLAC.ABBR payload; the Paris, Texas SPLAC record could look like this:

0 @P1@ SPLAC
1 PLAC Paris, Lamar County, Texas, USA
2 ABBR Paris Texas
2 DATE AFT 1845
1 PLAC Paris, Red River County, Republic of Texas
2 ABBR Paris, Republic of Texas
2 DATE BEF 1840
1 PLAC Paris, Lamar County, Republic of Texas
2 ABBR Paris Texas
2 DATE FROM 1840 TO 1845

NOTE: This record includes multiple Place Descriptions based on the history of the city over time. Only the first PLAC payload is required. A report created from this record could display either the SPLAC.PLAC payload or the SPLAC.PLAC.ABBR payload from the primary (first) PLAC. However, if a relative was born in Paris before Texas became part of the USA, the history may be important to a user, who may want to display “Paris Republic of Texas” rather than “Paris USA”.
This example outlines one issue with the GEDCOM-L multiple level SPLAC design. An SPLAC record would need to be created for each entity presented.

  1) Paris, 2) Lamar County, 3) Texas, 4) USA, 5) Red River County, 6) Republic of Texas

Paris would need two date-based links to superior levels 1) Lamar County 2) Red River County
Lamar County would need two date-based links to superior levels 1) Republic of Texas 2) Texas

The second issue is that the Record Instance of “Paris” has no identifier in it to indicate that it is the Paris that will eventually get us into a Texas land mass. Paris Ohio would start with Paris at the lowest level too!
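For the single-record Paris example above, a data-entry screen or report could pick the matching Place Description by year roughly as in this Python sketch. It only understands the three DATE forms used in that example (AFT, BEF, FROM ... TO), compares whole years, and treats AFT and BEF as strict comparisons, all of which are simplifying assumptions; `date_matches` and `describe` are hypothetical helper names.

```python
import re
from typing import Optional


def date_matches(payload: str, year: int) -> bool:
    """Check a year against 'AFT y', 'BEF y' or 'FROM y TO y' (simplified)."""
    m = re.fullmatch(r"FROM (\d{4}) TO (\d{4})", payload)
    if m:
        return int(m.group(1)) <= year <= int(m.group(2))
    m = re.fullmatch(r"(AFT|BEF) (\d{4})", payload)
    if m:
        bound = int(m.group(2))
        return year > bound if m.group(1) == "AFT" else year < bound
    return False  # unsupported DATE form


def describe(entries: list[tuple[str, str]], year: Optional[int] = None) -> str:
    """Return the PLAC payload valid in `year`, else the primary (first) PLAC."""
    if year is not None:
        for plac, date in entries:
            if date_matches(date, year):
                return plac
    return entries[0][0]


# (PLAC payload, DATE payload) pairs in the order of the example record @P1@.
paris = [
    ("Paris, Lamar County, Texas, USA", "AFT 1845"),
    ("Paris, Red River County, Republic of Texas", "BEF 1840"),
    ("Paris, Lamar County, Republic of Texas", "FROM 1840 TO 1845"),
]

print(describe(paris, 1838))  # Paris, Red River County, Republic of Texas
print(describe(paris))        # Paris, Lamar County, Texas, USA
```

When the year is unknown the sketch falls back to the primary PLAC, which matches the note that only the first PLAC payload is required.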


Dirk: I would not see the new SPLAC structure as a replacement for ADDR and would rather enter “Highland Cemetery, John Doe Grave” as ADDR. For me, ADDR are therefore not always mandatory postal addresses.
Reply: I do see the place of John Doe’s grave as just another place entity. I map them with an exact GPS locator, I can’t do that with an Address that does not have an SPLAC record associated with it! This also holds true for any place that I can map with a GPS locator. You may not do this, and this is fine, but I think readers of my website or book would like to be able to see where in the world a person was born, lived, died, and not just the closest city!


Dirk: The software can then generate an additional glossary page in which, among other things, “Hmbg = Hamburg” can be found as an explanation. Or have I misinterpreted ABBR here?
Reply: Either interpretation can be used although I don’t think abbreviating Hamburg is valuable, but to each his own. In my example above the Abbreviation (ABBR) could be “Paris TX” but how many of my readers in Norway, Germany, England and other places know what TX means? I would not know what Hmbg means, could it be “Humbug”?


Dirk: I don't particularly like the combination of PLAC and DATE to document historical name changes of the place (example “Oslo”). What should my software do with it? What can I do with the information as a user? “Oslo Fylke” would confuse me as a user. I don't speak Norwegian. Is this a two-part place name or a street name? Is this PLAC/DATE listing correct at all and where does it come from (don't you also have to insert a SOUR?)?
I could therefore do without DATE information.
Reply: What would you do differently to document and maintain historical place names? I’m open for suggestions. A relative of mine could have been born in Christiania, Norway, but died in Oslo, Norway. I would need to know that these are the same place so I can link to the same SPLAC record instance for both events, otherwise a user of my site could want to create a new SPLAC record instance. The DATE payload only helps to create a list of SPLAC records that could be the place of the event, the data entry person would pick from the list the proper Place Description and Time Frame, for example Christiania in 1850.
Oslo Fylke is the next highest district from the City of Oslo, in English it could be classified as a “county” but if I use that term in the USA they would not understand, and a Norwegian might correct me.

@albertemmerich
Collaborator

Are we mixing now GEDCOM data transfer and internal application processes?
In my application the search for "Paris" gives 34 results, listed for selection. If I select the Paris in Lamar County, Texas, then with one click the place records for the city of Paris, the county of Lamar, and the state of Texas are created (if not yet in the database), and linked as Paris => Lamar => Texas.
The application has 440 default locations linked to the Lamar County. The user can add any other location he finds in the sources within Lamar County, and link it, too. So it is straight forward to look for all events of individuals / families which took place in Lamar County or any of the locations linked to Lamar County. This part uses the linkage of locations to Lamar County (including linkages with more steps, so buildings, cemeteries etc are included). An individual buried at Baxterville Cemetery will be found by this search as well as any individual born, baptised, etc in Paris.
So for me: yes, creating the first SPLAC record for any place in Lamar County will create the Lamar County record, too. However, the next SPLAC record for another place in Lamar County can already use the Lamar County record and link to it. This includes all places at a level below cities, too. So I can create a SPLAC record for any farm and link it to a village in Lamar County or directly to the county. Once I have done this and transfer the data to the next database (maybe the next application), I do not want to create these links again and again.
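The search described here (all events in a county or in anything linked beneath it) can be sketched in a few lines of Python. This is only an assumed shape for such a search, not the actual application's code; the record ids, the farm and village, and the sample events are hypothetical.

```python
from collections import defaultdict


def descendants(parent_of: dict[str, str], root: str) -> set[str]:
    """Return `root` plus every location linked to it through any number of steps."""
    children = defaultdict(list)
    for child, parent in parent_of.items():
        children[parent].append(child)
    found, stack = {root}, [root]
    while stack:
        for child in children[stack.pop()]:
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


# Hypothetical links: child record id -> next-level record id.
parent_of = {
    "@P1@": "@P2@",  # Paris -> Lamar County
    "@P2@": "@P3@",  # Lamar County -> Texas
    "@P5@": "@P2@",  # a village -> Lamar County
    "@P4@": "@P5@",  # a farm -> that village
}

# Hypothetical events: (person, event tag, place record id).
events = [
    ("I1", "BIRT", "@P1@"),
    ("I2", "BURI", "@P4@"),
    ("I3", "DEAT", "@P3@"),
]

in_county = descendants(parent_of, "@P2@")
print([e for e in events if e[2] in in_county])
```

The search starts from a record id, not from the name "Lamar County", so it only stays unambiguous if the right county record was selected in the first place.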

@Norwegian-Sardines

Norwegian-Sardines commented Oct 4, 2024

How many of those links/events are in

The application has 440 default locations linked to the Lamar County.

Could some of them be in Lamar County, Mississippi?

@Norwegian-Sardines

So for me: yes, creating the first SPLAC record in any place in Lamar County will create the Lamar County record, too. However, the next SPLAC record of another place in Lamar County already can use the Lamar County record, and link to it.

My point here is that Lamar County (and potentially any place entity) is not unique within the GOV database; it exists within the context of its higher place entities, in this case the State of Texas, the State of Mississippi, or the State of Georgia. So using the record instance named "Lamar" or "Lamar County" as the entry point to find all events that occurred in that county must also identify the State within the United States. And of course, as stated previously, Lamar was also in the Country of "The Republic of Texas". This is why the hierarchy of different record instances has issues, and they are compounded if you want to include historical place names, place names by language, and place names that are not unique in the world due to their contextual connections to other place entities.

@albertemmerich
Collaborator

Lamar County in Texas has 133 locations in the default database. You are correct, we have to pick this SPLAC record for the County in Texas when searching. My application does it automatically, if you choose it as parent SPLAC record. If you only look for the name "Lamar County", you get the places in other states of the US, too. But that is shown in the result, as the search result will list the name of location, its type, its county, its state, and the country as well as the GPS coordinates.

By the way, do you know an external database which shows all the historic links in US? GOV could do that, if somebody enters the data. GOV is based on the community of researchers cooperating to enter all these data...

@Norwegian-Sardines

By the way, do you know an external database which shows all the historic links in US? GOV could do that, if somebody enters the data.

No I don’t know of an external DB of historical links for the USA. Sorry, I have enough to do, can’t use up my time entering data for other people!
