comparing integer lists

I have two small arrays of integers. (I don’t expect more than 16 elements in each array) No integer occurs more than once in an array. Integers are sorted from least to greatest, though they are not necessarily consecutive.

What’s an efficient way to diff them, so I get the elements in each array that aren’t in the other?
(I do want two resulting arrays, for the elements that were only in array 1, and only in array 2)

The simple way I thought of was to iterate through the first away searching for matching elements in the second array. If one is found, both elements are deleted. The remaining arrays are the elements that were only in each array.

Is this LabView, Java or C++??

In Java (the only language I know), I would use nested for loops that would compare an element in the first array wiht each element in the second array. An example is below, where any number that is found in the other array is changed to -1.


    public class ArraySort
   {
       public static void main(String]args)
      {
         int] num1={1,3,4,5,6,8,10,15,19,20,55,99,100,115,156,205};
         int] num2={0,2,3,4,6,7,10,19,20,45,55,65,100,156,200,256};
      
      //check for equal variables, if one is found, set element to -1
         for(int x=0; x<num1.length; x++)
         {
            for(int a=0; a<num2.length; a++)
               if(num1[x]==num2[a])
               {
                  num1[x]=-1;
                  num2[a]=-1;
               }
         }
         	//print the arrays
         for(int x=0; x<num1.length; x++)
            System.out.print(num1[x]+" ");
      
         System.out.println();
            	      
         for(int x=0; x<num2.length; x++)
            System.out.print(num2[x]+" ");
      }
   }

Say the arrays are labled ‘A’ and ‘B’. Just two pointers ‘a’ and ‘b’ (pointing in their respective arrays). Without loss of generality, say that ‘B’ starts with a larger element. Keep on advancing ‘a’ until *a < *b is no longer true. As these arrays are sorted, these values must be unique to A. Now advance ‘b’ in a similar manner. In the case where *a == *b, advance both pointers. Don’t output the element as it exists in both.

Alternately, add all elements of A into a hashtable. Now iterate through B outputting all elements that are not in the hashtable.

Both algorithms have the same asympotic complexity but the first doesn’t require any extra memory and should run a bit faster.

Here is algorithm 1


#include <stdio.h>
#include <stdlib.h>

void mydiff(int * arr1, int * end1,  int * arr2, int * end2, int * diff1, int* diff2);

int main() {
	int arr1] = {2, 4, 5, 6, 7, 8, 20, 33, 34, 56};
	int arr2] = {1, 2, 6, 7, 8, 10, 11, 45, 98};
	
	int * diff1 = calloc(sizeof(int), 10);
	int * diff2 = calloc(sizeof(int), 9);

	mydiff(arr1, arr1 + 10, arr2, arr2 + 9, diff1, diff2);
	
	printf("diff in arr1:
");
	while(*diff1) {
		printf("%d ", *diff1);
		diff1++;
	}
	printf("
diff in arr2:
");
	while(*diff2) {
		printf("%d ", *diff2);
		diff2++;
	}
	
	
}


void getTail(int * arr, int * end, int * diff) {
	while(arr != end) {
		*diff = *arr;
		diff++;
		arr++;
	}
}

void mydiff(int * arr1, int * end1,  int * arr2, int * end2, int * diff1, int * diff2) {
	
	if(arr1 == end1) {
		getTail(arr2, end2, diff2);
		return;
	}
		
	if(arr2 == end2) {
		getTail(arr1, end1, diff1);
		return;
	}
	
	while(*arr1 < *arr2 && arr1 != end1) {
		*diff1 = *arr1;
		diff1++;
		arr1++;
	}
	
	while(*arr2 < *arr1 && arr2 != end2) {
		*diff2 = *arr2;
		diff2++;
		arr2++;
	}
	
	if(*arr1 == *arr2) {
		if(arr1 != end1)
			arr1++;
		if(arr2 != end2)
			arr2++;
	}
	
	mydiff(arr1, end1, arr2, end2, diff1, diff2);
}


With only 16 elements, the major factor in performance is going to be the implementation details rather than the actual algorithm so the following is more of an academic exercise. You probably won’t notice any kind of difference in run time.

Leveraging the facts that a) the arrays are already sorted and b) there are no repetitions in the individual arrays, you can use a stack-matching algorithm. This will run in O(max(n,m)) time instead of the naive methods which will run O(n*m) for arrays sizes n and m. Here’s implementations in C and Java.


#include <stdlib.h>
#include <stdio.h>

/* Selects the unique elements between two arrays
 *
 * in1 is the first array
 * in1end is the pointer to past the end of the first array
 * in2 is the second array
 * in2end is the pointer to past the end of the second array
 * out1 is the first output array
 * out2 is the second output array
 *
 * returns the number of common elements, so
 * length(out1) = length(in1) - result;
 * length(out2) = length(in2) - result;
 */
int unique(register int *in1, int *in1end, int *in2, int *in2end, int *out1, int *out2)
{
	register int common = 0;
	register int val1 = *in1, val2 = *in2;
	for(;;)
	{
		if (val1 < val2)
		{
			*(out1++) = val1;
			if (++in1 > in1end)
				break;
			val1 = *in1;
		}
		else if (val1 > val2)
		{
			*(out2++) = val2;
			if (++in2 > in2end)
				break;
			val2 = *in2;
		}
		else
		{
			common++;
			if ((++in1 > in1end) || (++in2 > in2end))
				break;
			val1 = *in1;
			val2 = *in2;
		}
	}
	
	while(in1 < in1end)
		*(out1++)=*(in1++);
	while(in2 < in2end)
		*(out2++)=*(in2++);
	
	return common;
}

int main(int argc, char *argv])
{
	int *arr1, *arr2, *out1, *out2;
	int i, common, len1, len2;
	
	if (argc!=3)
	{
		printf("Usage: %s <first length> <second length>
", argv[0]);
		return 1;
	}
	
	/* read lengths */
	len1 = atoi(argv[1]);
	len2 = atoi(argv[2]);
	
	/* allocate array space */
	arr1 = malloc(len1*sizeof(int));
	out1 = malloc(len1*sizeof(int));
	arr2 = malloc(len2*sizeof(int));
	out2 = malloc(len2*sizeof(int));
	
	/* create two arrays of strictly increasing elements */
	srand ( time(NULL) );

	arr1[0] = rand() % 3;
	for(i=1; i < len1; i++)
		arr1* = arr1* + (rand()%3)+1;
		
	arr2[0] = rand() % 3;
	for(i=1; i < len2; i++)
		arr2* = arr2* + (rand()%3)+1;
	
	/* print initial contents of the arrays */
	printf("
Array 1
");
	for(i=0; i < len1; i++)
		printf("%d
", arr1*);
	
	printf("
Array 2
");
	for(i=0; i < len2; i++)
		printf("%d
", arr2*);
	
	/* run algorithm */
	common = unique(arr1, arr1+len1, arr2, arr2+len2, out1, out2);
	
	/* print results */
	printf("
%d common elements
", common);
	printf("
Array 1
");
	for(i=0; i < len1-common; i++)
		printf("%d
", out1*);
	
	printf("
Array 2
");
	for(i=0; i < len2-common; i++)
		printf("%d
", out2*);
	
	/* free memory */
	free(arr1);
	free(arr2);
	free(out1);
	free(out2);
}


import java.util.Arrays;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.Random;

public class Unique<E extends Comparable<E>>
{
	private ArrayList<E> out1, out2;
	
	public Unique(E] in1, E] in2)
	{
		int len1 = in1.length;
		int len2 = in2.length;
		out1 = new ArrayList<E>(len1);
		out2 = new ArrayList<E>(len2);
		
		Iterator<E> it1 = Arrays.asList(in1).iterator();
		Iterator<E> it2 = Arrays.asList(in2).iterator();
		E cur1 = it1.next();
		E cur2 = it2.next();
		
		for(;;)
		{
			int comp = cur1.compareTo(cur2);
			if (comp < 0)
			{
				out1.add(cur1);
				if (!it1.hasNext())
				{
					out2.add(cur2);
					break;
				}
				cur1 = it1.next();
			}
			else if (comp > 0)
			{
				out2.add(cur2);
				if (!it2.hasNext())
				{
					out1.add(cur1);
					break;
				}
				cur2 = it2.next();
			}
			else
			{
				if (!it1.hasNext() || !it2.hasNext())
					break;
				cur1 = it1.next();
				cur2 = it2.next();
			}
		}
		while(it1.hasNext())
		{
			out1.add(it1.next());
		}
		while(it2.hasNext())
		{
			out2.add(it2.next());
		}
	}
	
	public ArrayList<E> getOut1()
	{
		return out1;
	}
	public ArrayList<E> getOut2()
	{
		return out2;
	}
	
	
	public static void main(String] args)
	{
		if (args.length!=2)
		{
			System.err.println("Usage: java Unique <first length> <second length>");
			return;
		}
	
		/* read lengths */
		int len1 = Integer.parseInt(args[0]);
		int len2 = Integer.parseInt(args[1]);
	
		/* allocate array space */
		Integer] arr1 = new Integer[len1];
		Integer] arr2 = new Integer[len2];
	
		/* create two arrays of strictly increasing elements */
		Random rand = new Random();
		arr1[0] = rand.nextInt(3);
		for(int i=1; i < len1; i++)
			arr1* = arr1* + rand.nextInt(3)+1;
		
		arr2[0] = rand.nextInt(3);
		for(int i=1; i < len2; i++)
			arr2* = arr2* + rand.nextInt(3)+1;
	
		/* print initial contents of the arrays */
		System.out.print("Array 1: ");
		System.out.println(Arrays.toString(arr1));
		System.out.print("Array 2: ");
		System.out.println(Arrays.toString(arr2));
		System.out.println();
	
		/* run algorithm */
		Unique<Integer> u = new Unique<Integer>(arr1, arr2);
	
		/* print results */
		System.out.print("Array 1: ");
		System.out.println(u.getOut1());
		System.out.print("Array 2: ");
		System.out.println(u.getOut2());
	}
}

Theoretically might run faster with a big array, but leads to a little messier code than, say, [buildmaster5000]'s

EDIT: Just read [Chris27]'s post again. I think I’m doing the same thing he is, just entirely iteratively instead of the combination iterative/recursive approach he took.

–Ryan************

My solution isn’t really recursive as a modern compiler will turn the tail recursion into iterative code.

It could be rewritten as


void mydiff(int * arr1, int * end1,  int * arr2, int * end2, int * diff1, int * diff2) {
	
	for(;;) {
		if(arr1 == end1) {
			getTail(arr2, end2, diff2);
			break;
		}

		if(arr2 == end2) {
			getTail(arr1, end1, diff1);
			break;
		}

		while(*arr1 < *arr2 && arr1 != end1) {
			*diff1 = *arr1;
			diff1++;
			arr1++;
		}

		while(*arr2 < *arr1 && arr2 != end2) {
			*diff2 = *arr2;
			diff2++;
			arr2++;
		}

		if(*arr1 == *arr2) {
			if(arr1 != end1)
				arr1++;
			if(arr2 != end2)
				arr2++;
		}
	}
}

You already have some good ideas, and as stated, 16 elements is nothing, but I’ll point out that the task is essentially the same as merging sorted tapes. So now you can look it up in some of the classic algo books if you like.

Greg McKaskle

Let’s brainstorm a short list for fun.

I have Knuth’s 3-volume set here in the library.

I had a copy of Sedgewick years ago but can’t find it. Must have loaned it out without a homing device.