View Issue Details

IDProjectCategoryView StatusLast Update
0000320LDMud 3.3LPC Languagepublic2011-02-23 23:02
Reporter_xtian_ Assigned To 
PrioritynormalSeverityfeatureReproducibilityN/A
Status newResolutionopen 
Target Version3.3.721 
Summary0000320: efun to remove double elements in array
DescriptionAn Efun to remove double (triple, ...) elements in an array, without the need to go throug m_indices(mkmapping( my_array ));

example (let's call it "mkset"):

mkset( ({1,2,3,1}) ) == ({1,2,3})
Tagsldmud-extensions

Activities

zesstra

2009-03-03 16:06

administrator   ~0000967

Mhmmm. The questions is: how would we implement it internally? If it boils down to do m_indices(mkmapping( my_array )), is then an efun needed or may be just an sefun?

_xtian_

2009-03-26 06:04

reporter   ~0001006

First it would be nice to have this in the LPC language since this is a common operation - just, so that everybody (in different MUDs) uses the same vocabulary.

Second, and Im not familiar with the ldmud internal implementational details, what I thought of was the overhead in the LPC implementation - especially for very large Arrays.

But the stronger argument is the first one.

zesstra

2009-04-13 14:49

administrator   ~0001022

I did some preliminary tests in my homemud concerning the costs of m_indices(mkmapping)).
Starting point was an array with x elements which are initialised to random(y), so that after unification the array (and the intermediate mapping) size should be y.

Array size (x) | Unique elements (y) | Ticks | Evaltime
   10000 | 10 | 7 | <1 ms
   100000 | 10 | 7 | 0000005:0000010 ms
   1000000 | 10 | 7 | 0000068:0000070 ms

Same procedure with random(ARRAY_SIZE/10):
Array size (x) | Unique elements (y) | Ticks | Evaltime
   10000 | 1000 | 7 | 0000001:0000001 ms
   100000 | 10000 | 7 | 0000007:0000015 ms
   1000000 | 100000 | 7 | 0000364:0000340 ms

I would say, if your arrays are not bigger than 100k elements and if you don't unify one very often, the current approach is reasonably fast.

Possibilities:
a) create a default sefun using mkmapping() and m_indices() and recommending
   to use it.
b) make an efun mkset() to be a plain alias for m_indices(mkmapping(a)).
c) create mkset() and optimize a little bit for the use case (mkmapping()
   can deal with structs and create mappings with values, which is not needed
   here. But I guess, we won't save much.
d) don't use an intermediate mapping but something else (e.g. a plain hash
   table), which saves some overhead.

Of course, the amount of work increases from a to d. ;-) Any comments/preferences?

zesstra

2009-04-13 14:51

administrator   ~0001023

Gna. Sometimes I hate auto-magic-replace-stuff. Ok, the table again without any ~ characters:
Array size (x) | Unique elements (y) | Ticks | Evaltime
   10000 | 10 | 7 | <1 ms
   100000 | 10 | 7 | 10 ms
   1000000 | 10 | 7 | 70 ms

Same procedure with random(ARRAY_SIZE/10):
Array size (x) | Unique elements (y) | Ticks | Evaltime
   10000 | 1000 | 7 | 1 ms
   100000 | 10000 | 7 | 15 ms
   1000000 | 100000 | 7 | 340 ms

bubbs

2009-04-13 16:51

reporter   ~0001024

I don't think m_indices(mkmapping(arr)) give a satisfactory answer, since it does not maintain the order of the original array.

zesstra

2009-04-16 16:11

administrator   ~0001049

Ok, until now there wasn't a requirement to maintain the order of the array by Xtian. (Which is strictly speaking not even possible, if we remove elements from the array.)
Spontaneous thought: as long as you (can) sort the array, maintaining the order is not really important. But if that is impossible and you need the elements in the order you added them to the array, then probably you prevent the addition of double elements in the first place.

Coogan

2009-11-02 19:10

reporter   ~0001589

This requirement sounds like a kind of unique_array() in its simplest form without any arguments:
unique_array(({1,2,3,2,1}) --> ({1,2,3})

unique_array() could be extended as described here and accept arrays of some kinds (int*, object*, string*) to strip away multiple elements, returning a simple array.

zesstra

2009-11-03 05:51

administrator   ~0001590

Just to be clear: despite what the name suggests: this efun does not remove multiple elements from the result array, it just groups them together according to the value of the evaluated function/closure.

Issue History

Date Modified Username Field Change
2004-12-06 10:05 _xtian_ New Issue
2009-03-03 16:06 zesstra Note Added: 0000967
2009-03-26 06:04 _xtian_ Note Added: 0001006
2009-04-13 14:49 zesstra Note Added: 0001022
2009-04-13 14:51 zesstra Note Added: 0001023
2009-04-13 16:51 bubbs Note Added: 0001024
2009-04-16 16:11 zesstra Note Added: 0001049
2009-11-02 19:10 Coogan Note Added: 0001589
2009-11-03 05:51 zesstra Note Added: 0001590
2010-02-14 04:32 zesstra Tag Attached: ldmud-extensions
2011-02-23 23:02 zesstra Target Version => 3.3.721