View Issue Details

IDProjectCategoryView StatusLast Update
0000699LDMud 3.6Generalpublic2020-09-01 22:35
ReporterBardioc Assigned ToGnomi  
PrioritynormalSeverityfeatureReproducibilityN/A
Status resolvedResolutionfixed 
Fixed in Version3.6.3 
Summary0000699: save/restore of structs
DescriptionThe current implementation to save/restore structs is at least to say dangerous to use at all.

In LPC structs are internally implemented as arrays and saved in the way that their definition position (program name) comes first, followed by the elements ordered in the way of occurrence in the structure.

This is ambiguous in many ways:

1. what happens in case the struct is changed and a member is added not to the end, but in the middle or front of the structure?

A restore will cause a severe problem as wrong values are restored to wrong members!!!!!

2. what happens in case a struct hierarchy is build and a member is added to the super-type. Where is this member placed in the save file?

This is very dangerous, as it might be forgotten that the subtype and thus a save value of it changes too.

Suggestions:

Change the way structs are saved (implies a new save format)

Solution 1: Include the struct declarations in a separate part of the save-file and use the indices to restore the struct based on this additional information. At places where the struct is saved, positions relative to the declaration can be used. Compare saved declaration with current declaration and initialize all found members. Initialize all not found members with 0

Solution 2: Save the name of the member together with its value. Initialize the member if found in the current declaration, ignore if not found. Initialize all others with 0.

The second solution would cause the save files to become bigger as member names have to be saved at every occurrence.

ADDITION:

When thinking of the before mentioned problem of added struct members in between or in super types, the possibility to access members by index or initialization of structs by index should be forbidden as it might lead to hard to find errors. Imagine the following example:

struct foo
{
     string x;
};

struct baa (foo)
{
     string y;
};

struct baa mystruct = (<baa> "a", "b");

printf("x = %O, y = %O\n", mystruct->x, mystruct->y);
printf("x = %O, y = %O\n", mystruct->(0), mystruct->(1));

Exam question: what is the result?
Solution:

x = "a", y = "b"
x = "a", y = "b"

So what happens if some wizard now changes the struct foo in this way:

struct foo
{
     string x;
     string z;
}

Question: What is the result now?

x = "a", y = 0
x = "a", y = "b" <== this line is not telling the truth ... z = "b"

See the problem?

Because of all this i would strongly suggest to REMOVE the indexing by position as well as the initialization by position. I now this might break code, but this behavior is simply too dangerous to be permitted.

Any comments are welcome. Lets start a discussion here :-)
TagsNo tags attached.

Relationships

related to 0000698 resolvedGnomi Make different versions of a struct compatible to each other 

Activities

zesstra

2009-11-02 15:10

administrator   ~0001582

I discussed this briefly with Bardioc in private and I concur with this.
The issue with saving/restoring must definitely be addressed.
In addition to the names of the members we should also save the types of the members. Otherwise a change from struct a_s {int x;}; to struct a_s {mapping x;}; is not recognized. It may be bad style to do such a thing, but I think, it is our responsibility to check for this.

Concerning the indexing with numbers: We could assume, that every wizard has to right to shot himself in the foot... But I tend to agree with Bardioc. We might change remove it only in 3.5, so there will be no severe change in language during 3.3...

_xtian_

2009-11-02 16:43

reporter   ~0001584

Access through the numbers is useful to iterate through the struct elements.

I agree that it invites error prone code, but it also may be used to do something like:

// initialize all elements
for(i;i<sizeof(s);i++)
  s->(i)= ...;

I have already used it for similar things.

zesstra

2009-11-02 17:42

administrator   ~0001585

I also thought of that way to use this notation. As a comment, a simple way to achieve that result without indexing the struct itself would be to use to_struct() and to_array(). Both should be fairly efficient (because internally structs and arrays are so similar).
So for initialization one could use:
struct a_s a = to_struct(m_allocate(x,42.7), (<a_s>));
Or populate an array with individual values. Depending on the exact conditions, this _might_ be even faster than a for loop in LPC.
Of course, both is just as dangerous as the for loop. ;-)

On the other hand, if we keep indexing by numbers, there should be a HUGE warning sign in the documentation, both about indexing and struct literals.

Gnomi

2009-11-04 04:45

manager   ~0001597

Concerning the type of the members: I think that this should depend on the RTTC pragma. It's the goal that assignments to variables have their type checked at runtime as well and exactly the same should restore_object do with normal variables and struct members.

zesstra

2009-11-04 18:32

administrator   ~0001603

Ultimately, I agree with Gnomi.
I added checks for the types of restored variables and members of structs in restore_object()/restore_struct() in 3.5. (trunk, r2798+r2799).

Concerning savefile formats: I would prefer a format which stores the struct definition separately from the variables, because it is more compact. We can write the definition before the first struct variable using it.
Interesting questions here are: could such a definition be used to create a struct type if there is none? The use of such structs would be limited, but if such a struct inherits from a base struct still existing, it might be useful. But what if a program defining the correct struct type is loaded later...?

zesstra

2009-11-04 18:47

administrator   ~0001604

Ah, BTW: concerning the types there are two things:
a) check upon restore the correct types. This should depend on RTTC.
b) storing the types of members in the savefile to distinguish between two structs (old in the savefile, new one in the program) having members with the same name but different types. Here I am not so sure. So if the first a_s is in the savefile and the second in the program, the struct type from the savefile could be just regarded as unknown (and we don't restore unknown structs right now...):
struct a_s {int x};
struct a_s {string x};

_xtian_

2009-11-05 16:12

reporter   ~0001610

Correct me if I misunterstood, but the only legal form af a struct should be the one that has been compiled to date (the time of the restore). If a save-file containes a struct-definition it should only be used to compare it to the target format of the struct and to possibly convert it. The data in the save-file should not generate a new struct sub-type that has not been declared in running code.
Else strange old struct definitions could always crop up and lead to unclear errors that are not reproducible.

I even think that restoring to an unknown struct-type should throw an error because it is unpredictable what would happen next.

Also concerning the last note: Doesn't b) ==> a) ?

zesstra

2009-11-05 16:38

administrator   ~0001612

a) just means that we check if the member of the current struct is allowed to hold the type in the savefile. That you can do without having the struct definition saved.

b) is to compare the types of members of the current struct and the struct in the savefile (so the struct at the time of save_svalue()) to find structs having the same members but different types.

zesstra

2010-02-16 16:17

administrator   ~0001735

I think we should discuss a reasonable savefile format to store the struct definitions at the time of the save_svalue()...
As I said I would prefer to store the struct template only once in the savefile and not for every occurrence of a struct of that type. In that case we need to store the definitions before any variables using them, I would just argue for putting it before any variables.
But we also run into the problem of nested structs and struct inheritance and how to represent that. We might first store a list of structs referenced in the savefile and after that a block with struct definitions. In case of inheritance chains both lists may be sorted base-struct-first.

_xtian_

2010-02-16 18:36

reporter   ~0001738

There is also save/restore_value() to consider. As with save/restore_object() the question is, if it could be used to generate struct definitions at runtime - in my last comment I made my argument against that.

Maybe the easiest way to change the savefile format would be to dictate declared-where-first-used rule. In a one value format like save_value() this would change nothing. In a save_object() szenario the first saved definition of the struct would be used for all following ones with the same name. My two cents.

zesstra

2010-02-17 05:18

administrator   ~0001739

Yes, I agree with argument against generating struct definition upon restoring structs.

Ok, we can also mix the definition between the variables. But I think it might make things more complicated: if a mapping/array/struct contains a struct we have to write the struct definition before the variable which contains them. And if we write the defs when we reach a struct in a mapping, we already wrote a part of the mapping to the buffer. But IMHO we should not mix variables contents with 'inline' struct definition. A solution might be to write the defs to a different buffer and concatenate them in the end. But because we anyway iterate two times over the variables of an object and we anyway can't write the defs within a save_svalue() (because then the variable name was already written) we might also write all used struct defs in the beginning between header and variables...

BTW: I initially thought we don't have to care especially about inherited ones, because the members are copied to the inheritee, but then a struct might be restored which is incompatible to its (current) base struct.

BTW2: Changing the savefile format might cause problems with users of save_value(), depending on how the use it (e.g. rely on header+content are always only 2 lines), but I guess not too many use save_value().

If we read a struct definition during restore, we can directly compare the definition with the current definition in the system (if any) and abort the restore if they don't match and are not upgradable... (The other possible behaviour is to silently ignore any variables without matching struct definition, but I don't like that too much.)

Upgrading structs:
Reasonable if the current struct definition has more members or members in different sequence.
If member types changed, I would not regard them as upgradable (exception: change of 'mixed' to specific type).
If the current struct contains less members: abort the restore, ignore the struct or ignore the non-existing members? Do we issue a runtime warning?

_xtian_

2010-02-17 17:42

reporter   ~0001740

A valid reason for removing a struct element could be that you realized that you never really needed that element. Sometimes interfaces and data structures get smaller when they evolve. So a vanished element should not terminate the restoration of the other values. As for issuing a warning - I'm not sure.

Gnomi

2015-08-16 18:35

manager   ~0002262

I wouldn't go that far to include a full struct definition. It's not relevant.

All we want is to put the saved values back into a struct in the most sensible way. It doesn't matter if the current struct has other base structs than the saved struct, as long as all the members are there. Also if earlier the member was int|string and is now only int, that is not an obstacle as long as the saved value is an integer.

So for me are the RTT checks enough. If the concrete value type in the savefile matches the currently declared member type, that's good enough. Comparing the earlier definitions with the current one is just an additional impediment that hinders transitions.

That's why I would stick with the problem at hand: Added/removed members and changed order of them. For this we just need the name of the members at the time of saving (and the name of the struct). And there are two problems to discuss:

1. How to format this information?
I don't have a strong opinion on this one. Because save_value(({(<s>)})) must also work (and most programs expect save_value to return exactly two lines), I think it would be best to be inlined with the content.
Something like (<"struct_name filename.c 0000001 member1,member2,member3">,1,2,3)
where all future occurrences can have (<"struct_name filename.c 0000001">,5,6,7).
The 0000001 would be a savefile-internal counter and not the struct id that it is now (because the structname/filename is not enough, there might be two different versions of a struct of the same name floating around).

2. What to do with vanished members?
Because restore_objects also ignores vanished variables, I would do the same with vanished members. Ignore silently. Otherwise the removal of a member would be a big pain (you have to modify all savefiles *by hand* or give the struct a new name and load/copy to new struct/save for each file).

zesstra

2020-08-25 20:39

administrator   ~0002551

Uh, long forgotten...
After re-thinking about this, I think, I agree with Gnomi so far.
Concerning the format: I also don't have a strong opinion about it and my suggestion is to choose whichever has advantages for the implementation of save_*/restore_*. Both variants are compact concerning the savefile size, but the inline variant might save a loop. It complicates reading the savefile by a human, but not too much.

Gnomi

2020-08-25 20:42

manager   ~0002552

This is implemented in https://github.com/amotzkau/ldmud/commit/bebb18e80164d5f1f5069aff51e60b7382a5267a.

Issue History

Date Modified Username Field Change
2009-11-02 14:58 Bardioc New Issue
2009-11-02 15:10 zesstra Note Added: 0001582
2009-11-02 16:20 zesstra Relationship added related to 0000698
2009-11-02 16:43 _xtian_ Note Added: 0001584
2009-11-02 17:42 zesstra Note Added: 0001585
2009-11-04 04:45 Gnomi Note Added: 0001597
2009-11-04 18:24 zesstra Project LDMud => LDMud 3.5
2009-11-04 18:32 zesstra Note Added: 0001603
2009-11-04 18:47 zesstra Note Added: 0001604
2009-11-05 16:12 _xtian_ Note Added: 0001610
2009-11-05 16:38 zesstra Note Added: 0001612
2010-02-16 16:17 zesstra Note Added: 0001735
2010-02-16 18:36 _xtian_ Note Added: 0001738
2010-02-17 05:18 zesstra Note Added: 0001739
2010-02-17 17:42 _xtian_ Note Added: 0001740
2010-03-27 06:22 zesstra Target Version => 3.5.0
2015-08-16 18:35 Gnomi Note Added: 0002262
2017-10-04 21:26 zesstra Product Version 3.3 =>
2017-10-04 21:26 zesstra Target Version 3.5.0 => 3.5.1
2020-08-24 18:56 Gnomi Assigned To => Gnomi
2020-08-24 18:56 Gnomi Status new => assigned
2020-08-25 20:39 zesstra Note Added: 0002551
2020-08-25 20:42 Gnomi Note Added: 0002552
2020-09-01 22:34 Gnomi Project LDMud 3.5 => LDMud 3.6
2020-09-01 22:34 Gnomi Category Implementation => General
2020-09-01 22:35 Gnomi Status assigned => resolved
2020-09-01 22:35 Gnomi Resolution open => fixed
2020-09-01 22:35 Gnomi Fixed in Version => 3.6.3