View Issue Details

ID: 0000759
Project: LDMud 3.2
Category: Runtime
View Status: public
Last Update: 2010-10-16 12:27
Reporter: Coogan
Assigned To: Gnomi
Priority: normal
Severity: crash
Reproducibility: unable to reproduce
Status: resolved
Resolution: fixed
Product Version: 3.2.16
Summary: 0000759: segmentation fault
Description: Tubmud crashed today, for the first time in over a year.
Unfortunately, after an uptime of almost one year, I had to recompile the driver because of library updates that had taken place in the meantime. While recompiling I accidentally switched from 3.2-dev.724 to 3.2.16, so I don't know whether this crash is the same one that was fixed in 0000622.
Additional Information: The gdb backtrace says:

Core was generated by `/home/tubmud/mudbin/driver -Mkernel/master -DTUBMUD=1 -D__START_TIME__="_S_T_20'.
Program terminated with signal 11, Segmentation fault.
[New process 21901]
#0 large_free (ptr=0xaf50428 "ìž\003\n\001\001\006R\001\006q")
    at smalloc.c:2096
2096 *(ptr+size-1) = size; /* copy the size information */
(gdb) bt
#0 large_free (ptr=0xaf50428 "ìž\003\n\001\001\006R\001\006q")
    at smalloc.c:2096
#1 0x080dc0dc in xfree (ptr=0xaf50428) at smalloc.c:1245
#2 0x080dd063 in rexalloc_traced (p=0xaf47a88, size=8192) at smalloc.c:3051
#3 0x080b4b6c in realloc_mem_block (mbp=0x812c200, size=-1895365628)
    at prolang.y:911
#4 0x080bcdd8 in yyparse () at lang.y:7316
#5 0x080c3d3a in compile_file (fd=15) at prolang.y:12838
#6 0x080cc258 in load_object (lname=<value optimized out>, create_super=0,
    depth=0, chain=0x0) at simulate.c:1876
#7 0x080cc953 in lookfor_object (
    str=0xa19d5c0 "domains/glandon/npc/voittout", bLoad=1) at simulate.c:2289
#8 0x08082fe8 in eval_instruction (first_instruction=0x97f2823 "\033",
    initial_sp=0x81056b8) at interpret.c:16704
#9 0x0809ad02 in apply_low (fun=0x96cdd2c "reset", ob=0x9e2b654, num_arg=1,
    b_ign_prot=0) at interpret.c:21684
#10 0x08089bb2 in eval_instruction (
    first_instruction=0xbfbcb157 "\204\003\026", initial_sp=0x81056b8)
    at interpret.c:15753
#11 0x08081c85 in call_lambda (lsvp=0x81056a0, num_arg=3) at interpret.c:22774
#12 0x0808d6f8 in eval_instruction (
    first_instruction=0x96dcc13 "R\003\003\016\002(\a\b\2350X\v|\024\204ð",
    initial_sp=0x8105698) at interpret.c:15929
#13 0x0809ad02 in apply_low (fun=0x9515c84 "call_other", ob=0x807b6f5,
    num_arg=3, b_ign_prot=0) at interpret.c:21684
#14 0x0809afa9 in call_simul_efun (code=<value optimized out>, ob=0x8f070404,
    num_arg=3) at interpret.c:22879
#15 0x08082202 in call_lambda (lsvp=0x8105668, num_arg=3) at interpret.c:22805
#16 0x0808d6f8 in eval_instruction (
    first_instruction=0xae81203 "\016\020\0044^\021GN\\ ",
    initial_sp=0x81055b8) at interpret.c:15929
#17 0x08081f18 in call_lambda (lsvp=0x81055b0, num_arg=1) at interpret.c:22424
#18 0x0808d6f8 in eval_instruction (first_instruction=0xae845b7 "\\\024",
    initial_sp=0x81055a8) at interpret.c:15929
#19 0x0809ad02 in apply_low (fun=0xa7b7250 "call_command", ob=0x97f1e10,
    num_arg=1, b_ign_prot=0) at interpret.c:21684
#20 0x0809b05d in sapply_int (fun=0xa7b7250 "call_command", ob=0x97f1e10,
    num_arg=1, b_find_static=0) at interpret.c:21796
#21 0x0804cd97 in parse_command (buff=0xbfbcd0b6 "Call here reset 1",
    from_efun=0) at actions.c:1082
#22 0x0804e4e3 in execute_command (str=0xbfbcd0b6 "Call here reset 1",
    ob=0xa6d7ef8) at actions.c:1238
#23 0x08055f93 in backend () at backend.c:601
#24 0x080a7aa1 in main (argc=14, argv=0xbfbceb94) at main.c:517
(gdb) q
Tags: No tags attached.



2010-10-01 08:17

administrator   ~0001900

Without the core and the corresponding binary and source, it will be nearly impossible to analyze the cause of the crash. It looks like incorrect size information. But line 2096 is in build_block(), not in large_free() (at least in 3.2.16). This may be an issue with how gdb shows inlined functions.
The interesting question is whether this issue still exists in 3.3.x or 3.5.x. ;-)
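
For illustration: the faulting statement at smalloc.c:2096, `*(ptr+size-1) = size;`, stores the block size in the block's last word, so a corrupted size makes that store land far outside the block. The following is only a minimal sketch of that effect, not code from smalloc.c. The names `word_t` and `write_footer_checked`, and the bounds check itself, are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t word_t;   /* hypothetical word type for this sketch */

/* Copy `size` (in words) into the last word of the block that starts at
 * heap[start], like `*(ptr+size-1) = size;`, but refuse to write when the
 * footer would fall outside the heap array.
 * Returns 1 if the footer was written, 0 if `size` is implausible. */
static int write_footer_checked(word_t *heap, size_t heap_words,
                                size_t start, word_t size)
{
    assert(heap != NULL);
    if (start >= heap_words)
        return 0;
    if (size == 0 || size > heap_words - start)
        return 0;              /* a corrupted size would write out of bounds */
    heap[start + size - 1] = size;
    return 1;
}
```

With a 16-word heap, a block of 8 words at index 4 gets its footer at index 11, while an inflated size is rejected instead of producing a wild write.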


2010-10-02 11:36

reporter   ~0001901

This is more a 3.2.17 than a plain 3.2.16, as the fix for 0000677 is mentioned in the CHANGELOG. In the src directory, the following .c files have a newer timestamp than in the original 3.2.16:
-rw-r--r-- 1 tubmud tubmud 130713 2009-10-05 00:28 make_func.c
-rw-r--r-- 1 tubmud tubmud 38245 2009-10-06 09:34 pkg-tls.c
-rw-r--r-- 1 tubmud tubmud 54494 2009-10-06 10:15 efun_defs.c
-rw-r--r-- 1 tubmud tubmud 3303 2009-10-06 10:16 stdstrings.c
-rw-r--r-- 1 tubmud tubmud 522080 2009-10-06 10:16 lang.c


2010-10-02 11:47

reporter   ~0001902

The bzip2-ed core file is available under:


2010-10-16 10:28

manager   ~0001903

Zesstra is correct in that the size information of a memory block is corrupted (the upper half-word was changed from 0xf000 to 0x8f07; the lower half seems to be correct, and the bytes before and after it also seem okay).

But I have no idea how it could be changed in such a way.
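
As a small worked example of the half-word comparison: the `size` argument gdb printed for realloc_mem_block in frame #3, -1895365628, is 0x8f070404 when read as an unsigned 32-bit word, and its upper half is 0x8f07. Whether that is the same word as the corrupted size field in the core is an assumption. The helper names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Reinterpret a signed 32-bit value, as gdb prints it, as an unsigned word.
 * Conversion to an unsigned type is defined modulo 2^32 in C. */
static uint32_t word_from_int(int32_t v) { return (uint32_t)v; }

/* Upper and lower 16 bits of a 32-bit word. */
static uint16_t high_half(uint32_t w) { return (uint16_t)(w >> 16); }
static uint16_t low_half(uint32_t w)  { return (uint16_t)(w & 0xffffu); }

/* Self-check using the value from frame #3 of the backtrace. */
static void check_example(void)
{
    uint32_t word = word_from_int(-1895365628);

    assert(word == 0x8f070404u);
    assert(high_half(word) == 0x8f07u);
    assert(low_half(word) == 0x0404u);
}
```

Calling `check_example()` from a test program confirms the arithmetic. The same helpers can be applied to any word read out of a core with gdb's `x/wx`.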

If this happens more often, I'd recommend compiling with MALLOC_TRACE (which uses magic bytes for checking for correct pointers and gives information on the type of each memory block).
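
To show the idea behind magic bytes, here is a minimal sketch of a header that carries a magic word and a type tag in front of each allocation. This is not LDMud's MALLOC_TRACE implementation. `BLOCK_MAGIC`, `block_header` and the `traced_*` functions are hypothetical names, and the layout is an assumption chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define BLOCK_MAGIC 0x5ca1ab1eu   /* hypothetical magic value */

/* Header placed in front of every payload; the union keeps the payload
 * suitably aligned for any object type (max_align_t needs C11). */
typedef union {
    struct {
        uint32_t magic;   /* must equal BLOCK_MAGIC while the block is live */
        uint32_t type;    /* caller-defined tag describing the block */
        size_t   size;    /* payload size in bytes */
    } info;
    max_align_t align;
} block_header;

/* Allocate `size` payload bytes tagged with `type`; NULL on failure. */
static void *traced_alloc(size_t size, uint32_t type)
{
    block_header *hdr;

    if (size > SIZE_MAX - sizeof *hdr)
        return NULL;
    hdr = malloc(sizeof *hdr + size);
    if (hdr == NULL)
        return NULL;
    hdr->info.magic = BLOCK_MAGIC;
    hdr->info.type = type;
    hdr->info.size = size;
    return hdr + 1;
}

/* Nonzero if `p` points at a payload whose header still has its magic. */
static int traced_is_valid(const void *p)
{
    if (p == NULL)
        return 0;
    return ((const block_header *)p - 1)->info.magic == BLOCK_MAGIC;
}

/* Type tag of a valid block. */
static uint32_t traced_type(const void *p)
{
    assert(traced_is_valid(p));
    return ((const block_header *)p - 1)->info.type;
}

/* Free a block, aborting first if its header was overwritten. */
static void traced_free(void *p)
{
    block_header *hdr;

    if (p == NULL)
        return;
    assert(traced_is_valid(p));
    hdr = (block_header *)p - 1;
    hdr->info.magic = 0;   /* makes a later double free detectable */
    free(hdr);
}
```

A check like this fails at the free of a damaged block, which is usually much closer to the corrupting write than a later crash inside the allocator.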


2010-10-16 11:37

manager   ~0001904

Coogan, could you post the instrs.h from the sources?


2010-10-16 12:07

manager   ~0001905

Never mind, I found a bug in the compiler with "call_other" simul-efuns that corrupted the memory block afterwards. Maybe that resulted in the corruption of the program block itself. I'll commit a fix to the 3.2 trunk. It does not affect 3.3 or 3.5.


2010-10-16 12:27

manager   ~0001906

Fix committed as r2932.

Issue History

Date Modified Username Field Change
2010-09-30 19:02 Coogan New Issue
2010-10-01 08:17 zesstra Note Added: 0001900
2010-10-02 11:36 Coogan Note Added: 0001901
2010-10-02 11:47 Coogan Note Added: 0001902
2010-10-16 10:28 Gnomi Note Added: 0001903
2010-10-16 11:37 Gnomi Note Added: 0001904
2010-10-16 12:07 Gnomi Note Added: 0001905
2010-10-16 12:27 Gnomi Note Added: 0001906
2010-10-16 12:27 Gnomi Status new => resolved
2010-10-16 12:27 Gnomi Resolution open => fixed
2010-10-16 12:27 Gnomi Assigned To => Gnomi