Core evolution - New Attribute Types & Arrays
Student: Martin Škurla
Mentors: Tobias Ivarsson (Neo4j project), Mathieu Bastian (Gephi project)
The need of new types within Attributes API
My first proposal, “Adding support for Neo4j in Gephi v. 3”, mainly compared the Gephi and Neo4j graph model representations and proposed a data model and a data retrieval approach. The data retrieval proposal wrapped new types into existing ones. That is not a bad idea, but it may be better to create appropriate types rather than mere “wrappers”. The real question is how to add new types without polluting the Attributes API. Ideally the result should combine a decent design, a robust and error-free implementation, and backward compatibility.
The main goal of this document is to propose a good design and implementation of changes in the Attributes API.
Proposal
The proposal consists of two parts. The first covers support for new primitive types; the second covers support for arrays of primitive, wrapper and other reference types.
Primitive types
Of all primitive types, only char is not directly supported in Gephi. In the first proposal I suggested converting char into int or String.
Now I propose adding a dedicated type. The solution is easy and does not require any new class: we can just add a new CHAR enum constant, represented by java.lang.Character, to the AttributeType class. There is one design issue. Every AttributeType must be able to construct an instance from a String value, because the Object parse(String) method of the AttributeType class works this way. The solution is to create a Character instance from the first character of the String parameter.
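The first-character rule can be sketched as follows. This is only an illustration, not the Gephi code: the class name CharParseSketch and the method parseChar are hypothetical, and rejecting an empty String with an IllegalArgumentException is an assumption, since the proposal does not say what should happen for empty input.

```java
final class CharParseSketch {

    // Hypothetical helper: builds a Character from the first character of the input.
    static Character parseChar(String str) {
        if (str == null) {
            throw new NullPointerException("str");
        }
        if (str.isEmpty()) {
            // Assumption: empty input is rejected; the proposal leaves this case open.
            throw new IllegalArgumentException("Cannot create a Character from an empty String");
        }
        return Character.valueOf(str.charAt(0));
    }

    public static void main(String[] args) {
        // Only the first character of the input is used.
        System.out.println(parseChar("abc"));
    }
}
```

Whatever policy is chosen for empty input, it should probably be decided explicitly, because str.charAt(0) on an empty String throws a StringIndexOutOfBoundsException.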
Arrays of primitive, wrapper and other reference types
In the previous proposal I suggested converting any array of primitive types into a StringList instance and adding a new attribute column to record the real nature of the data.
The following text discusses the new array types. All of them follow one naming convention: every class is named “*List” (source file “*List.java”). We also need a way to distinguish our list types from java.util.List types, so I will use “OurList” to mean any of our list implementation types.
Not only primitive types but also wrapper types should be supported. This is useful when we have a collection of wrapper objects, which can easily be converted into an array. Support for arrays of other reference types may be added later, which is why arrays of reference types are supported too. Support for arbitrary-precision numbers may also be useful, so arrays of BigInteger will be implemented. Support for BigDecimal can be added in a very similar way.
In fact, internal representation works only with arrays of wrapper and other reference types, so arrays of primitive types have to be converted into arrays of wrapper types.
The StringList type in the Attributes API is well defined and implemented. We can take inspiration from this class and make every other “OurList” type look and work in a similar way. Most of the code will be the same for most “OurList” implementations, so the ideas from StringList can be used as a common pattern.
The AbstractList class was designed and implemented as the base class that all concrete “OurList” implementations extend. It provides functionality common to all of them. The NumberList class was designed and implemented for all “OurList” implementations that represent numbers (primitive types, wrapper types, BigInteger, BigDecimal).
If we want to do all of this in a type-safe way, the final implementation will be more complex and abstract.
Design & implementation
This chapter describes the design and the implementation. It shows the source code of the AbstractList and NumberList classes and of the JUnit tests, and describes some refactoring proposals for the current implementation.
Concrete implementation
AbstractList<T> class
package org.gephi.data.attributes.type;
import java.lang.reflect.Array;
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.util.Arrays;
public abstract class AbstractList<T> {
public static final String DEFAULT_SEPARATOR = ",|;";
protected final T[] list;
private volatile int hashCode = 0;
public AbstractList(String value, Class<T> finalType) {
this(value, DEFAULT_SEPARATOR, finalType);
}
public AbstractList(String value, String separator, Class<T> finalType) {
this(AbstractList.<T>parse(value, separator, finalType));
}
public AbstractList(T[] list) {
if (list == null) {
throw new NullPointerException();
}
this.list = Arrays.copyOf(list, list.length);
}
@SuppressWarnings("unchecked")
private static <T> T[] parse(String value, String separator, Class<T> finalType) {
if (value == null || separator == null || finalType == null) {
throw new NullPointerException();
}
assert !finalType.isPrimitive();
String[] stringValueList = value.split(separator);
T[] resultList = (T[]) Array.newInstance(finalType, stringValueList.length);
for (int i = 0; i < stringValueList.length; i++) {
String stringValue = stringValueList[i].trim();
T resultValue = null;
if (finalType == String.class)
resultValue = (T) stringValue;
else
resultValue = AbstractList.<T>createInstance(stringValue, finalType);
resultList[i] = resultValue;
}
return resultList;
}
@SuppressWarnings("unchecked")
private static <T> T createInstance(String value, Class<T> finalType) {
T resultValue = null;
try {
Method method = finalType.getMethod("valueOf", String.class);
resultValue =(T) method.invoke(null, value);
}
catch (NoSuchMethodException e) {
try {
Constructor<T> constructor = finalType.getConstructor(String.class);
resultValue = constructor.newInstance(value);
}
catch (NoSuchMethodException e1) {
throw new IllegalArgumentException("Type '" + finalType + "' does not have either method valueOf(String) or constructor <init> (String)...");
}
catch (Exception e2) {
// report the failure of the constructor call, not the outer NoSuchMethodException
e2.printStackTrace();
}
}
catch (Exception e) {
e.printStackTrace();
}
return resultValue;
}
public int size() {
return list.length;
}
public T getItem(int index) {
if (index >= list.length) {
return null;
}
return list[index];
}
public boolean contains(T value) {
return Arrays.asList(list).contains(value);
}
@Override
public String toString() {
StringBuilder builder = new StringBuilder();
for (int i = 0; i < list.length; i++) {
builder.append(list[i]);
builder.append(',');
}
if (list.length > 0) {
builder.deleteCharAt(builder.length() - 1);
}
return builder.toString();
}
@Override
public boolean equals(Object obj) {
if (!(obj instanceof AbstractList<?>)) {
return false;
}
AbstractList<?> s = (AbstractList<?>) obj;
if (s.size() != this.size()) {
return false;
}
for (int i = 0; i < list.length; i++) {
Object a = this.getItem(i);
Object b = s.getItem(i);
// null-safe comparison: two nulls are equal, a null never equals a non-null item
if (a == null ? b != null : !a.equals(b)) {
return false;
}
}
return true;
}
@Override
public int hashCode() {
if (hashCode == 0) {
int hash = 7;
for (int i = 0; i < list.length; i++) {
hash = 53 * hash + (this.list[i] != null ? this.list[i].hashCode() : 0);
}
hashCode = hash;
}
return hashCode;
}
}
As you can see from the source code:
- this class is abstract and uses generics
- internal representation of data is array of generic type
- there are three ways to create an instance of this class:
- from array of generic type
- from string value with default separator (',' or ';')
- from string value with given separator
- the creation of array from String values is done by 2 private static methods
- object of generic type can be created from String value if it has:
- public static <Type> valueOf(String) method or
- public <init>(String) constructor
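The parsing steps listed above can be tried in isolation. The sketch below is a simplified stand-in, not the AbstractList code: the class ListParseSketch and its methods are hypothetical, it returns a java.util.List instead of an array, and it wraps every reflective failure in an IllegalArgumentException, which is an assumption about error handling. It shows that the default separator is a regular expression passed to String.split, and that a type without a static valueOf(String) method (here BigInteger) falls back to its String constructor.

```java
import java.lang.reflect.Method;
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

final class ListParseSketch {

    // Same value as AbstractList.DEFAULT_SEPARATOR: a regular expression matching ',' or ';'.
    static final String DEFAULT_SEPARATOR = ",|;";

    // Hypothetical helper: tries static valueOf(String) first, then the (String) constructor.
    static <T> T fromString(String value, Class<T> type) {
        try {
            try {
                Method valueOf = type.getMethod("valueOf", String.class);
                return type.cast(valueOf.invoke(null, value));
            } catch (NoSuchMethodException e) {
                return type.getConstructor(String.class).newInstance(value);
            }
        } catch (Exception e) {
            // Assumption: all reflective failures are reported as IllegalArgumentException.
            throw new IllegalArgumentException(
                    "Cannot create " + type.getName() + " from '" + value + "'", e);
        }
    }

    // Splits the input on the default separator and converts every trimmed item.
    static <T> List<T> parse(String value, Class<T> type) {
        List<T> result = new ArrayList<T>();
        for (String item : value.split(DEFAULT_SEPARATOR)) {
            result.add(fromString(item.trim(), type));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("11, 22;33", Integer.class));
        System.out.println(parse("11;22", BigInteger.class));
    }
}
```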
NumberList<T extends Number> class
package org.gephi.data.attributes.type;
import java.lang.reflect.Array;
public abstract class NumberList<T extends Number> extends AbstractList<T> {
public NumberList(T[] wrapperArray) {
super(wrapperArray);
}
public NumberList(Object primitiveArray, int arrayLength) {
super(NumberList.<T>parse(primitiveArray, arrayLength));
}
public NumberList(String value, Class<T> finalType) {
this(value, AbstractList.DEFAULT_SEPARATOR, finalType);
}
public NumberList(String value, String separator, Class<T> finalType) {
super(value, separator, finalType);
}
@SuppressWarnings("unchecked")
private static <T extends Number> T[] parse(Object primitiveArray, int arrayLength){
if (primitiveArray == null)
throw new NullPointerException();
Class<T> wrapperClass = (Class<T>) getWrapperClass(primitiveArray);
T[] wrapperArray = (T[]) Array.newInstance(wrapperClass, arrayLength);
if (primitiveArray.getClass().isArray()) {
for (int i = 0; i < arrayLength; i++) {
T arrayItem = (T) Array.get(primitiveArray, i);
wrapperArray[i] = arrayItem;
}
}
else
throw new IllegalArgumentException("Given object is not a primitive array: " + primitiveArray.getClass());
return wrapperArray;
}
private static Class<?> getWrapperClass(Object primitiveArray) {
Class<?> primitiveArrayType = primitiveArray.getClass().getComponentType();
if (primitiveArrayType == byte.class)
return Byte.class;
else if (primitiveArrayType == short.class)
return Short.class;
else if (primitiveArrayType == int.class)
return Integer.class;
else if (primitiveArrayType == long.class)
return Long.class;
else if (primitiveArrayType == float.class)
return Float.class;
else if (primitiveArrayType == double.class)
return Double.class;
else if (primitiveArrayType == boolean.class)
return Boolean.class;
else if (primitiveArrayType == char.class)
return Character.class;
throw new IllegalArgumentException("Given parameter '" + primitiveArray.getClass() + "' is not array of primitive type...");
}
}
As you can see from the source code:
- this class is abstract and uses generics
- this class extends AbstractList class
- this class accepts only instances of java.lang.Number:
- wrapper types
- BigInteger, BigDecimal
- there is one additional way to create an instance of this class:
- from array of primitive type as object and its length
- the conversion from primitive to wrapper array is done by 2 private static methods
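The primitive-to-wrapper conversion mentioned in the last point relies on java.lang.reflect.Array, which boxes elements on access. The sketch below is a reduced illustration under assumptions: the class BoxingSketch and the method box are hypothetical, and the result is a plain Object[] rather than the typed T[] that NumberList builds. It also uses Array.getLength, which suggests that a separate length parameter might not be strictly needed.

```java
import java.lang.reflect.Array;
import java.util.Arrays;

final class BoxingSketch {

    // Hypothetical helper: copies any primitive array into an Object[] of boxed values.
    static Object[] box(Object primitiveArray) {
        Class<?> componentType = primitiveArray.getClass().getComponentType();
        // getComponentType() returns null when the object is not an array at all.
        if (componentType == null || !componentType.isPrimitive()) {
            throw new IllegalArgumentException(
                    "Not an array of primitive type: " + primitiveArray.getClass());
        }
        int length = Array.getLength(primitiveArray);
        Object[] boxed = new Object[length];
        for (int i = 0; i < length; i++) {
            // Array.get wraps each primitive element in its wrapper type.
            boxed[i] = Array.get(primitiveArray, i);
        }
        return boxed;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(box(new int[] {11, 22, 33})));
        System.out.println(Arrays.toString(box(new double[] {1.5, 2.5})));
    }
}
```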
Example “OurList” implementations
Now we can write concrete array types. To create an “OurList” implementation, we extend either the AbstractList<T> or the NumberList<T extends Number> abstract class.
We can create as many constructors as we want, but in most cases the implementation class will have the same constructors as the base class it extends. There are two exceptions. The first is when we do not want all the functionality of the base class (BigIntegerList does not support creation from arrays of primitive types); the second is the opposite case, when we want more (StringList can be created from char[] and IntegerList from byte[]).
AbstractList and NumberList provide the basic functionality: reporting the number of items and getting any item by its index. Any other functionality can be added as additional methods.
Examples of “OurList” implementations are shown below.
package org.gephi.data.attributes.type;
public final class StringList extends AbstractList<String> {
public StringList(char[] list) {
super(StringList.parse(list));
}
public StringList(String[] list) {
super(list);
}
public StringList(String value) {
this(value, AbstractList.DEFAULT_SEPARATOR);
}
public StringList(String value, String separator) {
super(value, separator, String.class);
}
private static String[] parse(char[] list) {
String[] resultList = new String [list.length];
for (int i = 0; i < list.length; i++)
resultList[i] = "" + list[i];
return resultList;
}
public String getString(int index) {
return getItem(index);
}
}
package org.gephi.data.attributes.type;
public final class IntegerList extends NumberList<Integer> {
public IntegerList(byte[] list) {
this(IntegerList.parse(list));
}
public IntegerList(int[] list) {
super(list, list.length);
}
public IntegerList(Integer[] list) {
super(list);
}
public IntegerList(String value) {
this(value, AbstractList.DEFAULT_SEPARATOR);
}
public IntegerList(String value, String separator) {
super(value, separator, Integer.class);
}
private static int[] parse(byte[] list) {
int[] resultList = new int [list.length];
for (int i = 0; i < list.length; i++)
resultList[i] = list[i];
return resultList;
}
}
package org.gephi.data.attributes.type;
import java.math.BigInteger;
public final class BigIntegerList extends NumberList<BigInteger> {
public BigIntegerList(BigInteger[] list) {
super(list);
}
public BigIntegerList(String value) {
this(value, AbstractList.DEFAULT_SEPARATOR);
}
public BigIntegerList(String value, String separator) {
super(value, separator, BigInteger.class);
}
}
AttributeType class
One existing class that must be modified is the AttributeType class. For every new attribute type, you have to:
- add enum constant
- modify Object parse(String str) method
In the current implementation, the static AttributeType parse(Object obj) method would also have to be modified for every new type. The method was refactored to look the type up among the enum constants, so it no longer needs to be modified.
There is one design issue. To integrate new types with the AttributeType class and its Object parse(String str) method, every “OurList” implementation should have a constructor or a factory method with one String parameter.
package org.gephi.data.attributes.api;
import java.math.BigInteger;
import org.gephi.data.attributes.type.StringList;
import org.gephi.data.attributes.type.IntegerList;
import org.gephi.data.attributes.type.BigIntegerList;
import org.gephi.data.attributes.type.TimeInterval;
public enum AttributeType {
CHAR(Character.class),
FLOAT(Float.class),
DOUBLE(Double.class),
INT(Integer.class),
LONG(Long.class),
BOOLEAN(Boolean.class),
STRING(String.class),
LIST_STRING(StringList.class),
LIST_INTEGER(IntegerList.class),
LIST_BIGINTEGER(BigIntegerList.class),
TIME_INTERVAL(TimeInterval.class);
private final Class type;
AttributeType(Class type) {
this.type = type;
}
@Override
public String toString() {
return type.getSimpleName();
}
public String getTypeString() {
return super.toString();
}
public Class getType() {
return type;
}
public Object parse(String str) {
switch (this) {
case CHAR:
return new Character(str.charAt(0));
case FLOAT:
return new Float(str);
case DOUBLE:
return new Double(str);
case INT:
return new Integer(str);
case LONG:
return new Long(str);
case BOOLEAN:
return new Boolean(str);
case LIST_STRING:
return new StringList(str);
case LIST_INTEGER:
return new IntegerList(str);
case LIST_BIGINTEGER:
return new BigIntegerList(str);
case TIME_INTERVAL:
return new TimeInterval(str);
}
return str;
}
public static AttributeType parse(Object obj) {
Class<?> c = obj.getClass();
for (AttributeType attributeType : AttributeType.values()) {
if (c.equals(attributeType.getType()))
return attributeType;
}
return null;
}
}
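The two lookups of AttributeType can be tried on a reduced enum. The sketch below is hypothetical: MiniAttributeType and its constants are not part of Gephi, and the two methods are named parseValue and typeOf here only to keep them apart; they stand for Object parse(String str) and static AttributeType parse(Object obj). It illustrates that the class-based lookup needs no change when a constant is added, as long as every constant is registered with exactly the class of the values it holds.

```java
import java.math.BigInteger;

enum MiniAttributeType {
    CHAR(Character.class),
    INT(Integer.class),
    STRING(String.class),
    BIGINTEGER(BigInteger.class);

    private final Class<?> type;

    MiniAttributeType(Class<?> type) {
        this.type = type;
    }

    // Stands for Object parse(String str): builds a value of this constant's class.
    Object parseValue(String str) {
        switch (this) {
            case CHAR:
                return Character.valueOf(str.charAt(0));
            case INT:
                return Integer.valueOf(str);
            case BIGINTEGER:
                return new BigInteger(str);
            default:
                return str;
        }
    }

    // Stands for static AttributeType parse(Object obj): finds the constant
    // whose registered class equals the runtime class of the given object.
    static MiniAttributeType typeOf(Object obj) {
        for (MiniAttributeType candidate : values()) {
            if (candidate.type.equals(obj.getClass())) {
                return candidate;
            }
        }
        return null; // no constant is registered for this class
    }
}
```

A constant registered with the wrong class would never be found by typeOf, so the class passed to each constant deserves a test of its own.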
DataIndex class
The last class we need to modify is the DataIndex class, which is responsible for storing references. The listing below includes the new maps; comments mark the modified methods and the newly added methods.
package org.gephi.data.attributes.model;
import java.lang.ref.WeakReference;
import java.util.WeakHashMap;
import org.gephi.data.attributes.type.StringList;
import org.gephi.data.attributes.type.IntegerList;
import org.gephi.data.attributes.type.BigIntegerList;
import org.gephi.data.attributes.type.TimeInterval;
public class DataIndex {
private WeakHashMap<Character, WeakReference<Character>> charMap;
private WeakHashMap<Float, WeakReference<Float>> floatMap;
private WeakHashMap<Integer, WeakReference<Integer>> intMap;
private WeakHashMap<String, WeakReference<String>> stringMap;
private WeakHashMap<Boolean, WeakReference<Boolean>> booleanMap;
private WeakHashMap<StringList, WeakReference<StringList>> stringListMap;
private WeakHashMap<IntegerList, WeakReference<IntegerList>> integerListMap;
private WeakHashMap<BigIntegerList, WeakReference<BigIntegerList>> bigIntegerListMap;
private WeakHashMap<Long, WeakReference<Long>> longMap;
private WeakHashMap<Double, WeakReference<Double>> doubleMap;
private WeakHashMap<TimeInterval, WeakReference<TimeInterval>> timeIntervalMap;
public DataIndex() {
charMap = new WeakHashMap<Character, WeakReference<Character>>();
floatMap = new WeakHashMap<Float, WeakReference<Float>>();
intMap = new WeakHashMap<Integer, WeakReference<Integer>>();
stringMap = new WeakHashMap<String, WeakReference<String>>();
booleanMap = new WeakHashMap<Boolean, WeakReference<Boolean>>();
stringListMap = new WeakHashMap<StringList, WeakReference<StringList>>();
integerListMap = new WeakHashMap<IntegerList, WeakReference<IntegerList>>();
bigIntegerListMap = new WeakHashMap<BigIntegerList, WeakReference<BigIntegerList>>();
longMap = new WeakHashMap<Long, WeakReference<Long>>();
doubleMap = new WeakHashMap<Double, WeakReference<Double>>();
timeIntervalMap = new WeakHashMap<TimeInterval, WeakReference<TimeInterval>>();
}
// modified methods
public int countEntries() {
int entries = 0;
entries += charMap.size();
entries += floatMap.size();
entries += intMap.size();
entries += stringMap.size();
entries += booleanMap.size();
entries += stringListMap.size();
entries += integerListMap.size();
entries += bigIntegerListMap.size();
entries += longMap.size();
entries += doubleMap.size();
entries += timeIntervalMap.size();
return entries;
}
public void clear() {
charMap.clear();
stringListMap.clear();
stringMap.clear();
floatMap.clear();
booleanMap.clear();
intMap.clear();
longMap.clear();
doubleMap.clear();
timeIntervalMap.clear();
integerListMap.clear();
bigIntegerListMap.clear();
}
// newly added methods
Character pushData(Character data) {
WeakReference<Character> value = charMap.get(data);
if (value == null) {
WeakReference<Character> weakRef = new WeakReference<Character>(data);
charMap.put(data, weakRef);
return data;
}
return value.get();
}
IntegerList pushData(IntegerList data) {
WeakReference<IntegerList> value = integerListMap.get(data);
if (value == null) {
WeakReference<IntegerList> weakRef = new WeakReference<IntegerList>(data);
integerListMap.put(data, weakRef);
return data;
}
return value.get();
}
BigIntegerList pushData(BigIntegerList data) {
WeakReference<BigIntegerList> value = bigIntegerListMap.get(data);
if (value == null) {
WeakReference<BigIntegerList> weakRef = new WeakReference<BigIntegerList>(data);
bigIntegerListMap.put(data, weakRef);
return data;
}
return value.get();
}
}
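All pushData methods follow the same pattern: look the value up in a WeakHashMap and return the instance stored earlier, or register the new one. A generic sketch of that pattern is shown below. The class WeakInternSketch is hypothetical and is not proposed as a replacement for the per-type maps; the extra check for a cleared reference is an added precaution and an assumption about the desired behaviour, not something the listing above does.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

final class WeakInternSketch<T> {

    private final Map<T, WeakReference<T>> map = new WeakHashMap<T, WeakReference<T>>();

    // Returns the canonical instance equal to data, registering data if none is known yet.
    T pushData(T data) {
        WeakReference<T> ref = map.get(data);
        T canonical = (ref != null) ? ref.get() : null;
        if (canonical == null) {
            // Either the value was never seen, or the earlier instance was already collected.
            map.put(data, new WeakReference<T>(data));
            return data;
        }
        return canonical;
    }

    int countEntries() {
        return map.size();
    }

    void clear() {
        map.clear();
    }
}
```

The pattern depends on equals and hashCode of the stored type, which is why AbstractList overrides both.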
JUnit tests
To test all implemented classes, I created simple JUnit 4 tests. They test the creation of lists from different inputs.
import org.gephi.data.attributes.type.StringList;
import org.junit.Test;
import static org.junit.Assert.*;
public class StringListTest {
@Test
public void testDefaultSeparator() {
StringList list = new StringList("aa,bb;cc");
assertEquals(list.size(), 3);
}
@Test
public void testGivenSeparator() {
StringList list = new StringList("aa/bb/cc", "/");
assertEquals(list.size(), 3);
}
@Test
public void testArray() {
StringList list = new StringList(new String[] {"aa", "bb", "cc"});
assertEquals(list.size(), 3);
}
@Test
public void testEmptyArray() {
StringList list = new StringList(new String[0]);
assertEquals(list.size(), 0);
}
}
import org.gephi.data.attributes.type.IntegerList;
import org.junit.Test;
import static org.junit.Assert.*;
public class IntegerListTest {
@Test
public void testCreatingListDefaultSeparator() {
IntegerList list = new IntegerList("11,22;33");
assertEquals(list.size(), 3);
}
@Test
public void testCreatingListGivenSeparator() {
IntegerList list = new IntegerList("11/22/33", "/");
assertEquals(list.size(), 3);
}
@Test
public void testCreatingListPrimitiveArray() {
IntegerList list = new IntegerList(new int[] {11, 22, 33});
assertEquals(list.size(), 3);
}
@Test
public void testCreatingListWrapperArray() {
IntegerList list = new IntegerList(new Integer[] {11, 22, 33});
assertEquals(list.size(), 3);
}
@Test
public void testEmptyPrimitiveArray() {
IntegerList list = new IntegerList(new int [0]);
assertEquals(list.size(), 0);
}
@Test
public void testEmptyWrapperArray() {
IntegerList list = new IntegerList(new Integer [0]);
assertEquals(list.size(), 0);
}
}
import java.math.BigInteger;
import org.gephi.data.attributes.type.BigIntegerList;
import static org.junit.Assert.*;
import org.junit.Test;
public class BigIntegerListTest {
@Test
public void testCreatingListDefaultSeparator() {
BigIntegerList list = new BigIntegerList("11,22;33");
assertEquals(list.size(), 3);
}
@Test
public void testCreatingListGivenSeparator() {
BigIntegerList list = new BigIntegerList("11/22/33", "/");
assertEquals(list.size(), 3);
}
@Test
public void testCreatingListArray() {
BigIntegerList list = new BigIntegerList(new BigInteger[] { new BigInteger("11"), new BigInteger("22"), new BigInteger("33")});
assertEquals(list.size(), 3);
}
@Test
public void testEmptyArray() {
BigIntegerList list = new BigIntegerList(new BigInteger [0]);
assertEquals(list.size(), 0);
}
}
Some design guidelines
- To prevent misuse of the API, all implementation classes (all classes except AbstractList and NumberList) should be declared final.
- Every item can be retrieved by its index with the T getItem(int index) method. For backward compatibility, StringList also has a String getString(int index) method, which only calls getItem(int). In all other cases getItem(int) is preferred.
- When an “OurList” implementation needs to convert data in its constructors, appropriate private static <Type> parse(...) methods should be created.
Summary of changes in Gephi API
To summarize, the following changes will be made to the Gephi API:
- addition of org.gephi.data.attributes.type.AbstractList class
- addition of org.gephi.data.attributes.type.NumberList class
- refactoring of org.gephi.data.attributes.type.StringList class
- addition of implementation classes (in this document, IntegerList and BigIntegerList)
- changes in org.gephi.data.attributes.api.AttributeType class:
- addition of enum constants
- modification in Object parse(String str) method
- refactoring of static AttributeType parse(Object obj) method
- changes in org.gephi.data.attributes.model.DataIndex class, for every new type we have to:
- add attribute
- add initialization in constructor
- modify countEntries() and clear() methods
- add appropriate <Type> pushData(<Type> data) method
- addition of JUnit tests
- possible changes in other APIs (maybe the need to add some functionality to the AbstractList or NumberList class to support polymorphism, easy future changes, etc.)
What about … ?
Now it is time to summarize the implementation with its advantages and disadvantages.
Backward compatibility
The only class that needs to be backward compatible is StringList. The proposed implementation of this class preserves backward compatibility, so there should be no problems when the code is integrated.
Extensibility, future changes, additions
The API was designed to be easy to extend. We can easily add any other kind of array of primitive, wrapper or other reference types.
Performance
In some places Java reflection must be used. Reflection has a negative impact on performance, so I tried to minimize its use. Reflection is used only during object creation, in the following cases:
- to invoke the constructor with one String parameter
- to invoke the valueOf(String) method
- to create an instance of array of given type
- to get items from primitive array
Memory consumption
As described before, the internal representation of the data is an array of generic type, so all arrays of primitive types have to be converted into arrays of wrapper types. Despite this, I do not think it will cause serious memory overhead.
Complexity & ease of implementation
The complexity comes mainly from generics and reflection, and it is concentrated in the AbstractList and NumberList abstract classes. A Gephi developer will interact only with the final implementations of these classes, and all classes are covered by JUnit tests, so implementing further “OurList” classes should be easy.
Questions
Any advice, implementation limitations or maybe integration problems?

