0000320: efun to remove double elements in array - MantisBT

ID	Project	Category	View Status	Date Submitted	Last Update

0000320	LDMud 3.3	LPC Language	public	2004-12-06 09:05	2011-02-23 22:02

Reporter	_xtian_	Assigned To
Priority	normal	Severity	feature	Reproducibility	N/A
Status	new	Resolution	open
Target Version	3.3.721

Summary	0000320: efun to remove double elements in array
Description	An Efun to remove double (triple, ...) elements in an array, without the need to go throug m_indices(mkmapping( my_array )); example (let's call it "mkset"): mkset( ({1,2,3,1}) ) == ({1,2,3})
Tags	ldmud-extensions

zesstra 2009-03-03 15:06 administrator ~0000967	Mhmmm. The questions is: how would we implement it internally? If it boils down to do m_indices(mkmapping( my_array )), is then an efun needed or may be just an sefun?

_xtian_ 2009-03-26 05:04 reporter ~0001006	First it would be nice to have this in the LPC language since this is a common operation - just, so that everybody (in different MUDs) uses the same vocabulary. Second, and Im not familiar with the ldmud internal implementational details, what I thought of was the overhead in the LPC implementation - especially for very large Arrays. But the stronger argument is the first one.

zesstra 2009-04-13 12:49 administrator ~0001022	I did some preliminary tests in my homemud concerning the costs of m_indices(mkmapping)). Starting point was an array with x elements which are initialised to random(y), so that after unification the array (and the intermediate mapping) size should be y. Array size (x) \| Unique elements (y) \| Ticks \| Evaltime 10000 \| 10 \| 7 \| <1 ms 100000 \| 10 \| 7 \| 0000005:0000010 ms 1000000 \| 10 \| 7 \| 0000068:0000070 ms Same procedure with random(ARRAY_SIZE/10): Array size (x) \| Unique elements (y) \| Ticks \| Evaltime 10000 \| 1000 \| 7 \| 0000001:0000001 ms 100000 \| 10000 \| 7 \| 0000007:0000015 ms 1000000 \| 100000 \| 7 \| 0000364:0000340 ms I would say, if your arrays are not bigger than 100k elements and if you don't unify one very often, the current approach is reasonably fast. Possibilities: a) create a default sefun using mkmapping() and m_indices() and recommending to use it. b) make an efun mkset() to be a plain alias for m_indices(mkmapping(a)). c) create mkset() and optimize a little bit for the use case (mkmapping() can deal with structs and create mappings with values, which is not needed here. But I guess, we won't save much. d) don't use an intermediate mapping but something else (e.g. a plain hash table), which saves some overhead. Of course, the amount of work increases from a to d. ;-) Any comments/preferences?

zesstra 2009-04-13 12:51 administrator ~0001023	Gna. Sometimes I hate auto-magic-replace-stuff. Ok, the table again without any ~ characters: Array size (x) \| Unique elements (y) \| Ticks \| Evaltime 10000 \| 10 \| 7 \| <1 ms 100000 \| 10 \| 7 \| 10 ms 1000000 \| 10 \| 7 \| 70 ms Same procedure with random(ARRAY_SIZE/10): Array size (x) \| Unique elements (y) \| Ticks \| Evaltime 10000 \| 1000 \| 7 \| 1 ms 100000 \| 10000 \| 7 \| 15 ms 1000000 \| 100000 \| 7 \| 340 ms

bubbs 2009-04-13 14:51 reporter ~0001024	I don't think m_indices(mkmapping(arr)) give a satisfactory answer, since it does not maintain the order of the original array.

zesstra 2009-04-16 14:11 administrator ~0001049	Ok, until now there wasn't a requirement to maintain the order of the array by Xtian. (Which is strictly speaking not even possible, if we remove elements from the array.) Spontaneous thought: as long as you (can) sort the array, maintaining the order is not really important. But if that is impossible and you need the elements in the order you added them to the array, then probably you prevent the addition of double elements in the first place.

Coogan 2009-11-02 18:10 reporter ~0001589	This requirement sounds like a kind of unique_array() in its simplest form without any arguments: unique_array(({1,2,3,2,1}) --> ({1,2,3}) unique_array() could be extended as described here and accept arrays of some kinds (int, object, string*) to strip away multiple elements, returning a simple array.

zesstra 2009-11-03 04:51 administrator ~0001590	Just to be clear: despite what the name suggests: this efun does not remove multiple elements from the result array, it just groups them together according to the value of the evaluated function/closure.

Date Modified	Username	Field	Change
2004-12-06 09:05	_xtian_	New Issue
2009-03-03 15:06	zesstra	Note Added: 0000967
2009-03-26 05:04	_xtian_	Note Added: 0001006
2009-04-13 12:49	zesstra	Note Added: 0001022
2009-04-13 12:51	zesstra	Note Added: 0001023
2009-04-13 14:51	bubbs	Note Added: 0001024
2009-04-16 14:11	zesstra	Note Added: 0001049
2009-11-02 18:10	Coogan	Note Added: 0001589
2009-11-03 04:51	zesstra	Note Added: 0001590
2010-02-14 03:32	zesstra	Tag Attached: ldmud-extensions
2011-02-23 22:02	zesstra	Target Version	=> 3.3.721