Generate Annotations¶
This part of the package parser deals with the automatic generation of annotations that should help the user to quickly and efficiently correct various problems in an API. For this purpose, data from previously analyzed code is used and the generated annotations are stored in a JSON file.
Main Functions¶
flowchart LR
User --> gen_an["generate_annotations()"]
subgraph box[library-analyzer]
gen_an -.->|include| b["preprocess_usages()"]
gen_an -.->|include| c["generate_annotation_dict()"]
gen_an -.->|include| d["get_type_annotations()"]
end
generate_annotations()
¶
Serves as a central function that takes care of inputs and outputs and bundles all underlying functions. The exact structure and a detailed view of its interaction with sub-functions will be explained further later.
preprocess_usages()
¶
Used to clean up the transferred analysis results and prepare them for further use.
generate_annotation_dict()
¶
Generates the various annotations by consecutively calling the passed functions and collects the partial results.
get_[type]_annotation()
¶
This name serves only as a placeholder. [type]
must be replaced by one of the following keywords: {boundary, constant,
enum, optional, remove, required}.
Generates the annotations of the corresponding type. All functions of this type have the following
signature: get_type_annotations(UsageCountStore, API, AnnotationsStore)
.
UsageCountStore
and API
represent the set of data extracted from the analyzed code. The functionality and task of
the AnnotationStore
will be explained later.
Classes¶
We use the following dataclasses to store the related information. Compared to tuples, lists, and dictionaries, dataclasses offer better readability of the code and allow to define the interactions between functions in a clear and consistent way.
AnnotationStore¶
classDiagram
BaseAnnotation <|-- ConstantAnnotation
BaseAnnotation <|-- RemoveAnnotation
BaseAnnotation <|-- OptionalAnnotation
BaseAnnotation <|-- RequiredAnnotation
BaseAnnotation <|-- BoundaryAnnotation
BaseAnnotation <|-- EnumAnnotation
BoundaryAnnotation ..> Interval
EnumAnnotation ..> EnumPair
AnnotationStore "0..1" o-- "*" BaseAnnotation
class BaseAnnotation{
<<dataclass>>
String target
to_json()
}
class ConstantAnnotation{
<<dataclass>>
String defaultType
String defaultValue
}
class RemoveAnnotation{
<<dataclass>>
}
class OptionalAnnotation{
<<dataclass>>
String defaultType
String defaultValue
}
class RequiredAnnotation{
<<dataclass>>
}
class BoundaryAnnotation{
<<dataclass>>
String defaultType
List[Interval] interval
}
class EnumAnnotation{
<<dataclass>>
String enumName
List[EnumPair] pairs
}
class Interval{
<<dataclass>>
Boolean isDiscrete
Integer lowerIntervalLimit
Integer lowerLimitType
Integer upperIntervallLimit
Integer upperLimitType
to_json()
}
class EnumPair{
<<dataclass>>
String stringValue
String instanceName
to:json()
}
class AnnotationStore{
<<dataclass>>
List[ConstantAnnotation] constant
List[RemoveAnnotation] remove
List[OptionalAnnotation] optional
List[RequiredAnnotation] required
List[BoundaryAnnotation] boundary
List[EnumAnnotation] enum
__init__()
to_json()
}
The AnnotationStore class is used for the collection of the individual annotations. An instance of this class is passed
to the individual get_[type]_annotation()
functions. These then place their results in the list assigned to them.
ParameterType¶
classDiagram
ParameterType <.. ParameterInfo
class ParameterInfo{
ParametrType type
String value
String value_type
__init__()
}
class ParameterType{
<<enumeration>>
CONSTANT
OPTIONAL
REQUIRED
UNUSED
}
The ParameterInfo class is used to encapsulate the collected information for a given parameter which is generated in
the get_parameter_info()
function.
UsageStoreCount¶
classDiagram
class UsageStoreCount{
Counter[ClassQName] class_usages
Counter[FunctionQName] function_usages
Counter[ParameterQName] parameter_usages
dict[ParameterQName, Counter[StringifiedValue]] value_usages
__init__()
__eq__()
__hash__()
from_json(json) UsageStoreCount
add_class_usage(ClassQname, Integer)
add_function_usage(FunctionQName, Integer)
add_parameter_usages(ParameterQName)
add_value_usage(ParameterQName, StringifiedValue, Integer)
init_value(ParameterQName)
remove_class(ClassQName)
remove_function_usages(FunctionQname)
n_class_usages(ClassQName)
n_function_usages(FunctionQName)
n_parameter_usages(ParameterQName)
n_value_usages(ParameterQName, StringifiedValue)
to_json()
}
This class is a slimmed down version of the UsageStore class.
For the automatic generation of annotations, it is in most cases sufficient to know how often a certain element (class, function, parameter, value) is used. In the original class, there are more details stored for each usage, which leads to a more complex and harder to use structure.
The methods of this class can be roughly divided into three categories:
Setter¶
All methods that start with add, remove or init are used to manipulate the counter of a specific element. The QName ( QualifiedName) is used as a type of ID.
Getter¶
All methods that correspond to the form n_[type]_usages()
are used to read out the number of usages of an element. The
Qname is used in the same way here.
In-/Ouput¶
The methods to_json()
and from_json()
are used to store the content of the UsageCountStore as a json file or to read
it from one.
generate_annotations()
¶
This function acts as the central interface for other parts of the program.
Input¶
The function receives two filehandlers and a string that specifies a path.
The file handlers each point to a JSON file. This is the information collected during the code analysis. The path specifies where the results are to be stored.
preprocess_usages()
¶
This function needs to be executed before the data can be analyzed and performs the following three tasks:
-
remove_internal_usages()
Since we are only concerned with the use of outward-facing package elements, all elements that are used exclusively internally are excluded from consideration.
-
add_unused_api_elements()
Some API elements are not used at all and, therefore, do not appear in the listing of all used elements. However, it is necessary that they do for the following process of the program.
-
add_implicit_usages_of_default_value()
If no value is supplied for an optional parameter when a function is called, the default value of the parameter is used implicitly. This implicit use of a value does not appear in the usage data. However, it is necessary that they do so for later analysis.
All functions of the form get_[type]_annotation()
are passed in a list to the generate_annotation_dict()
function.
This function then calls them one after the other.
Finally, the data that resides in the AnnotationStore
instance is stored in the output JSON file.
stateDiagram
direction LR
[*] --> check_arguments
check_arguments --> collect_annotation_getter
collect_annotation_getter --> preprocess_usages
state generate_annotation_dict{
direction LR
preprocess_usages --> call_annotation_getter
}
call_annotation_getter --> dump_annotation_store
Testing¶
AnnotationStore
¶
For the AnnotationStore
and all associated classes there are automatic tests that ensure that the to_json()
methods
work properly. For this purpose, a test configuration of the individual classes is created, and their output is then
compared against the expected value.
UsageCountStore
¶
For this class, there are automatic checks for the input and output methods, that compare the actual result against the expected value. There are also various tests for every setter and getter in which the state of the object is checked after.
generate_annotations()
¶
For this function, all called functions are tested automatically. For this purpose, there is test data for each individual getter, for which the output is then compared against the expected value. We decided to split the test data and test the called functions as standalone, because the implementation of tests for new features would otherwise lead to a process of rechecking and updating all the previous tests and their data.