Elżbieta Jasińska *, Edward Preweda *, Jan Ruchel *

Modeling Transaction Prices of Properties Based on Qualitative and Quantitative Features **

1. Introduction

A property is a special type of commodity whose features can be divided into measurable ones, such as area, and immeasurable ones, such as street or district. This division requires a dual approach to modeling the transaction price, depending on the type of attributes accepted. The existing methods of testing the real estate market are divided into qualitative and quantitative methods. The first approach is based on sociological techniques, for example on the relevance of various features of the property for potential buyers, and on experience, knowledge and intuition, which make it possible to express an opinion on the development of market phenomena when they are determined by immeasurable factors. Among the quantitative methods, statistical analysis and econometric models dominate; these model transaction prices and real estate values [2]. The methods used so far in Poland for real estate market analysis are mostly based on assigning numerical values to qualitative characteristics. Another approach separates the qualitative from the quantitative traits: after an initial analysis is done, e.g. based on a ranking, the property value is corrected by quantitative methods.

The aim of this study is to extend the methods recently used on the real estate market with a multi-dimensional analysis that allows the impact of single variables to be compared and these attributes to be evaluated without assigning them arbitrary numerical values. The study was carried out with the C&RT (Classification and Regression Trees) method, which does not require scaling of attributes that can only be described on a qualitative scale. This proposal extends the existing research by taking into account quantitative and qualitative characteristics at the same time, with no need to assign numerical values to the latter. This allows the location of the premises to be introduced through the chosen street, which has been ignored so far.

* Faculty of Mining Surveying and Environmental Engineering, AGH University of Science and Technology, Krakow

** Article created as part of the statutory research in the Chair of Geomatics, No 11.11.150.006

The research was based on about 109 housing properties located in two selected districts of the City of Krakow. These dwellings were sold between November 2008 and March 2009.

2. Data Analysis Method by C&RT

Studies conducted with classification trees make it possible to incorporate attributes that are difficult to convert to a quantitative scale (the name of a district, precinct number, street name). This is an unquestionable advantage, because the final decision about buying a property, or the transaction price, can be expected to depend on characteristics of this type. With the proposed model, numerical values need not be assigned to these characteristics, and none of the other characteristics retained in the analysis is sacrificed. “Classification tree” is used here as a general term; depending on how the dependent variable is measured, it covers discriminatory or regression models. The discriminatory model assumes a qualitative dependent variable and indicates, in the terminal node, its membership of a particular class. The regression model (regression tree) results from segmenting the sample on a quantitative dependent variable; in the terminal node it gives the average value and variance of this variable [1].
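How a regression tree can split on a nominal attribute such as a street name, without any numerical coding, can be sketched as follows. This is only an illustration under stated assumptions: the street labels and prices are hypothetical, and the exhaustive search over two-way groupings is a simple stand-in, not necessarily the procedure used by the software in this study.

```python
from itertools import combinations
from statistics import fmean

def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = fmean(values)
    return sum((v - mean) ** 2 for v in values)

def best_nominal_split(rows, attribute, target):
    """Try every two-way grouping of the attribute's categories.

    Returns (left_categories, total_sse_after_split) for the grouping
    with the smallest within-group sum of squares, or None if the
    attribute has fewer than two categories.
    """
    categories = sorted({row[attribute] for row in rows})
    first, rest = categories[0], categories[1:]
    best = None
    # Keeping the first category on the left avoids testing each
    # partition twice; k < len(rest) keeps the right side non-empty.
    for k in range(len(rest)):
        for extra in combinations(rest, k):
            left = {first, *extra}
            left_vals = [r[target] for r in rows if r[attribute] in left]
            right_vals = [r[target] for r in rows if r[attribute] not in left]
            cost = sse(left_vals) + sse(right_vals)
            if best is None or cost < best[1]:
                best = (left, cost)
    return best

# Hypothetical unit prices (zł/m²) on three hypothetical streets.
rows = [
    {"street": "A", "price": 6700}, {"street": "A", "price": 6650},
    {"street": "B", "price": 7000}, {"street": "B", "price": 6950},
    {"street": "C", "price": 6720}, {"street": "C", "price": 6680},
]
left_group, cost_after = best_nominal_split(rows, "street", "price")
```

The streets are grouped purely by how similar their prices are, so no ordering or numerical value has to be attached to the street names.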

The rules characterizing the market price were identified under the following settings:

– pruning on variance – if the variance does not decrease in any of the descendant nodes, the division is eliminated when the tree is pruned;

– maximum number of nodes: 1000;

– minimum number of nodes: 5;

– the proper tree size was chosen on the basis of the cost of a 10-fold cross-validation (understood as the variance of the prediction of the continuous dependent variable); the tree with the best predictive ability was taken as the basis for selecting the price-setting attributes and for examining the relationships between them and the transaction price.
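The cross-validation cost mentioned in the last setting can be sketched as below. This is a minimal illustration, assuming the cost is the mean squared prediction error over the held-out folds; the helper names, the single-node "tree" that predicts the training mean, and the data are all hypothetical.

```python
import random
from statistics import fmean

def k_fold_indices(n_cases, k=10, seed=0):
    """Shuffle case indices and deal them into k folds of near-equal size."""
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validation_cost(rows, fit, predict, k=10, seed=0):
    """Mean squared prediction error over k held-out folds.

    rows    -- list of (features, price) pairs
    fit     -- function(training_rows) -> model
    predict -- function(model, features) -> predicted price
    """
    squared_errors = []
    for fold in k_fold_indices(len(rows), k, seed):
        held_out = set(fold)
        training = [row for i, row in enumerate(rows) if i not in held_out]
        model = fit(training)
        for i in fold:
            features, price = rows[i]
            squared_errors.append((price - predict(model, features)) ** 2)
    return fmean(squared_errors)

# Simplest possible "tree": one node that predicts the training mean.
def fit_mean(training):
    return fmean([price for _, price in training])

def predict_mean(model, features):
    return model

# Hypothetical data: 20 dwellings with slowly rising unit prices.
rows = [({"street": "A"}, 6700.0 + 10 * i) for i in range(20)]
cost = cross_validation_cost(rows, fit_mean, predict_mean)
```

Swapping `fit_mean` and `predict_mean` for a real tree-growing routine would give the cost for one candidate tree; repeating this for every tree in the pruning sequence gives the curve from which the final tree is chosen.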

For the tree created this way, a ranking of the variables was established. The importance (validity) of a predictor presented in it is the reciprocal of the total re-substitution cost in all nodes of the created tree. It is expressed on a scale of 0 to 1 (scaled so that its maximum is 1), which may be regarded as analogous to correlation coefficients [3], but the ranking does not show whether the predictor influences the value of the dependent variable positively or negatively. It may also happen that a predictor which was not used as a splitting criterion in the final tree gets a high position in the ranking. This is possible when, in most divisions, such an attribute was second best at reducing the variance in the child nodes. Although it ultimately does not appear in the tree graph, its predictive ability is higher than that of an attribute which “used up its predictive power” in the first divisions of the tree and then was not essential for further segmentation.
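Why a predictor that is never chosen for a split can still head the ranking can be illustrated with a small sketch. It assumes a common CART-style convention in which, at every node, each predictor is credited with the variance reduction its own best split would achieve, and the totals are scaled so that the maximum is 1; the exact formula used by the software is not given in the text, and all numbers below are hypothetical.

```python
def predictor_importance(reductions_per_node):
    """Sum each predictor's best achievable variance reduction over all
    nodes and scale the totals so that the largest equals 1."""
    totals = {}
    for node in reductions_per_node:
        for name, gain in node.items():
            totals[name] = totals.get(name, 0.0) + gain
    top = max(totals.values())
    return {name: total / top for name, total in totals.items()}

# Hypothetical variance reductions available at three split nodes.
nodes = [
    {"Street": 900.0, "Rooms layout": 850.0, "Floor": 100.0},
    {"Standard": 600.0, "Rooms layout": 580.0, "Floor": 50.0},
    {"Street": 400.0, "Rooms layout": 390.0, "Floor": 20.0},
]

# The attribute actually used at each node is the one with the largest gain.
chosen = [max(node, key=node.get) for node in nodes]
importance = predictor_importance(nodes)
```

In this made-up case “Rooms layout” is the runner-up at every node and never appears in `chosen`, yet its accumulated total is the largest, so its scaled importance is 1.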

3. Presentation of the Results

Use of the C&RT algorithm does not require the features of the property to be converted to a quantitative scale, which made it possible to include the characteristic “Street”, usually ignored, in the study. The analysis is illustrated on two selected districts: Lagiewniki-Borek Falecki and Debniki.

The creation of the final model, on the basis of which the transaction price of a real estate was modeled, started from the most elaborate model. The next step was pruning the tree, based on the re-substitution cost and the cross-validation cost. These values increase as pruning continues (in successive trees the number of terminal nodes decreases, and each further tree presents more general criteria for the division). The re-substitution cost describes how the values of the dependent variable in a given leaf relate to the leaf average (close or equal to it). Notably, the cross-validation cost for tree number 8 (on the test samples) is similar to that of the first model, although the cost generally increases as the number of nodes decreases. This supports choosing this tree as the best final cut. The other trees show lower predictive relevance on the v sub-samples drawn from the data, as figure 1 shows.

The best division of a given node is the one that gives the greatest decrease in the re-substitution cost. The C&RT algorithm aims to separate high and low values of the dependent variable, which means that in a properly built model the values higher than those in the parent node go to one child node and the lower values go to the other [4].
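The decrease in cost produced by a split can be checked on the root split reported in figure 3 (node 1 with N=45 and Var=16365, divided into node 2 with N=15, Var=7733 and node 3 with N=30, Var=8222). The sketch below assumes the cost of a split is the case-weighted average of the child variances; the function name is illustrative.

```python
def weighted_child_variance(children):
    """children: list of (n_cases, variance) pairs for the child nodes."""
    total_cases = sum(n for n, _ in children)
    return sum(n * var for n, var in children) / total_cases

# Root split of the Lagiewniki-Borek Falecki tree (values from figure 3).
parent_variance = 16365
children = [(15, 7733), (30, 8222)]

variance_after_split = weighted_child_variance(children)  # 8059.0
reduction = parent_variance - variance_after_split        # 8306.0
```

Under this reading, the first division on “Street” roughly halves the variance of the unit price within the groups.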

On the basis of the characteristics of the selected regression tree, the features were ranked by their ability to differentiate the properties, as illustrated in figure 2. Based on this chart, four characteristics with coefficients higher than 0.8 can be distinguished: “Rooms layout”, the “Street” on which the premises are located, “Standard” and “Neighborhood”. These are mainly characteristics of a qualitative nature; a special case is “Street”, which is the most difficult to express by a numerical value. The next positions are taken by “Expenditure”, “Surface”, “Floor”, “Year of Building” and whether a “Basement” belongs to the premises, followed by “Communication Access”.

The analyzed area is characterized by a highly developed communication network, both public transport (two tram depots) and roads (vehicle traffic). Features directly related to the location of the apartment in the building and to the age of the building are not as important as the standard of the residence or its functionality, as most buildings were built at a similar time.

Fig. 1. Sequence chart of substitution costs and costs of cross-validation for the district Borek Falecki-Lagiewniki (horizontal axis: tree number, 0 to 16; vertical axis: cost, 0 to 20000; series: substitution cost and cross-validation cost; the chosen model is marked)

Fig. 2. The validity of the attributes for the district Borek Falecki-Lagiewniki (vertical axis: importance; attributes: Surface, Expenditure, Floor, Year of Building, Street, Neighborhood, Standard, Communication Access, Rooms Layout, Basement)

While analyzing the regression tree, a segmentation of 45 housing properties in the district Lagiewniki-Borek Falecki was carried out. Based on figure 3, conditional sentences assigning dwellings to one of the selected sub-groups were created:

– If the property is located on one of the streets Borsucza, Chmielna, Słupska or Sowia, and its “Rooms layout” is “Very favorable” or “Favorable”, the market value of that property is 6700 zł/m² ± 47 zł. The transaction price of premises located on one of these streets with an “Average” or “Disadvantage” “Rooms layout” is 6550 zł/m² ± 237 zł.

– If the property is located on one of the streets Brożka or Cegielniana, and its “Standard” is “High”, the transaction price is 7000 zł/m² ± 82 zł. The analyzed database has three such properties.

– If the apartment is located on one of the streets Borkowskie Błonia, Brożka or Zdunów, and its “Standard” is not “High”, its transaction price is 6728 zł/m² ± 45 zł.
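The conditional sentences derived from figure 3 can also be written as a small decision function. This is a sketch based on one reading of the figure: the node numbers and mean prices are taken from it, while the function name and the exact spelling of the category labels are illustrative assumptions.

```python
def lagiewniki_unit_price(street, rooms_layout, standard):
    """Return (terminal node ID, mean unit price in zł/m²) following the
    splits of the Lagiewniki-Borek Falecki regression tree (figure 3)."""
    if street in {"Borsucza", "Chmielna", "Słupska", "Sowia"}:
        if rooms_layout in {"Very favorable", "Favorable"}:
            return 4, 6700
        return 5, 6550
    if standard == "High":
        if street in {"Brożka", "Cegielniana"}:
            return 18, 7000
        return 19, 6875
    if street in {"Borkowskie Błonia", "Brożka", "Zdunów"}:
        return 28, 6728
    return 29, 6825

# Example: a high-standard flat on Cegielniana falls into node 18.
node_id, price = lagiewniki_unit_price("Cegielniana", "Average", "High")
```

Note that “Street” is tested at three different nodes, each time with a different grouping of street names.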

Fig. 3. Regression tree system for real estate housing for the district Lagiewniki-Borek Falecki

ID=1 (N=45, Med=6769, Var=16365), split on Street
    Street = Borsucza, Chmielna, Słupska, Sowia: ID=2 (N=15, Med=6640, Var=7733), split on Rooms layout
        Rooms layout = V. favorable, Favorable: ID=4 (N=9, Med=6700, Var=2222)
        Rooms layout = Other: ID=5 (N=6, Med=6550, Var=2500)
    Street = Other: ID=3 (N=30, Med=6833, Var=8222), split on Standard
        Standard = High: ID=16 (N=11, Med=6909, Var=6280), split on Street
            Street = Brożka, Cegielniana: ID=18 (N=3, Med=7000, Var=6666)
            Street = Other: ID=19 (N=8, Med=6875, Var=1875)
        Standard = Other: ID=17 (N=19, Med=6789, Var=4099), split on Street
            Street = Borkowskie Błonia, Brożka, Zdunów: ID=28 (N=7, Med=6728, Var=2040)
            Street = Other: ID=29 (N=12, Med=6825, Var=1875)

The diagram in figure 3 shows that node 3 is split into nodes 16 and 17, node 16 into nodes 18 and 19, and node 17 into nodes 28 and 29. This numbering is inherited from the most extensive model. The presented tree consists of 5 split nodes and 6 terminal nodes, and it is the smallest tree fulfilling the rule of a single standard deviation. While dividing the set, the feature “Street”, which is ranked second among the predictors, was used three times, and the feature “Rooms layout”, which is the most important, only once. On this basis, it can be concluded that “Rooms layout” was the next-best candidate considered for splitting the nodes. As mentioned previously, the ranking of predictors does not necessarily have to coincide with the tree graph, because it summarizes the rankings created at each division.
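The rule of a single standard deviation can be sketched as follows, assuming it refers to the usual one-standard-error pruning rule: among all trees whose cross-validation cost lies within one standard error of the minimum, the one with the fewest terminal nodes is chosen. The costs and standard errors below are hypothetical and are not read from figure 1.

```python
def choose_tree(trees):
    """Pick the smallest tree whose cross-validation cost does not exceed
    the minimum cost plus that minimum's standard error."""
    best = min(trees, key=lambda t: t["cv_cost"])
    threshold = best["cv_cost"] + best["cv_se"]
    eligible = [t for t in trees if t["cv_cost"] <= threshold]
    return min(eligible, key=lambda t: t["terminal_nodes"])

# Hypothetical pruning sequence (tree number, size, CV cost, standard error).
trees = [
    {"tree": 1, "terminal_nodes": 14, "cv_cost": 9000.0, "cv_se": 1500.0},
    {"tree": 5, "terminal_nodes": 9, "cv_cost": 8800.0, "cv_se": 1400.0},
    {"tree": 8, "terminal_nodes": 6, "cv_cost": 9100.0, "cv_se": 1450.0},
    {"tree": 12, "terminal_nodes": 3, "cv_cost": 12500.0, "cv_se": 1800.0},
]
chosen = choose_tree(trees)
```

With these made-up values the six-leaf tree is selected: its cost is slightly above the minimum but within the tolerance, and it is much simpler than the tree with the lowest cost.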

The graph presented in figure 4 can be expressed in the form of a few conditional sentences, such as:

– If the housing property is characterized by “Very favorable” “Communication Access” and its “Standard” is “High”, the transaction price of the property is 6100 zł/m² ± 100 zł; the database contains ten such properties. The transaction price of a property with the same “Communication Access” but a lower “Standard” is 5780 zł/m² ± 98 zł; the database contains five such properties.

Fig. 4. Regression tree for the Debniki district

ID=1 (N=64, Med=5739, Var=62380), split on Communication Access
    Communication Access = V. favorable: ID=2 (N=15, Med=5993, Var=32622), split on Standard
        Standard = High: ID=4 (N=10, Med=6100, Var=10000)
        Standard = Other: ID=5 (N=5, Med=5780, Var=9600)
    Communication Access = Other: ID=3 (N=49, Med=5661, Var=45639), split on Street
        Street = Obozowa, Zamkowa, Konfederacka, Zachodnia, Ruczaj, Szwedzka, Harasymowicza, Komuny Paryskiej: ID=8 (N=13, Med=5854, Var=25562), split on Street
            Street = Obozowa: ID=10 (N=1, Med=6300, Var=0)
            Street = Other: ID=11 (N=12, Med=5816, Var=9722)
        Street = Other: ID=9 (N=36, Med=5591, Var=34652), split on Rooms layout
            Rooms layout = Other: ID=18 (N=32, Med=5631, Var=24648), split on Surface
                Surface < 43: ID=20 (N=6, Med=5783, Var=14722)
                Surface >= 43: ID=21 (N=26, Med=5596, Var=20369)
            Rooms layout = Adverse: ID=19 (N=4, Med=5275, Var=1875)

– If the “Communication Access” of the apartment is “Favorable”, “Average” or “Adverse”, and the residence is located on one of the streets Zamkowa, Konfederacka, Zachodnia, Szwedzka, Harasymowicza, Komuny Paryskiej or Ruczaj, its transaction unit price is 5816 zł/m² ± 99 zł. Moreover, one can distinguish a single property situated on Obozowa Street, which cost 6300 zł/m².

– For a dwelling located on a street other than those mentioned above whose “Rooms layout” is “Adverse”, the unit price is 5275 zł/m² ± 43 zł. If the “Rooms layout” is “Average”, “Favorable” or “Very favorable” and the “Surface” is smaller than 43 m², its transaction unit price is 5783 zł/m² ± 121 zł. If its “Surface” is at least 43 m², the price of 1 m² can be estimated as 5596 zł ± 143 zł.
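The ± values quoted in these sentences appear to be the standard deviations of the terminal nodes, that is, the square roots of the variances shown in figure 4. A short check with the node values read from the figure (the variable names are illustrative):

```python
from math import sqrt

# Terminal nodes of the Debniki tree: ID -> (mean unit price, variance).
leaves = {
    4: (6100, 10000),
    5: (5780, 9600),
    11: (5816, 9722),
    19: (5275, 1875),
    20: (5783, 14722),
    21: (5596, 20369),
}

# Standard deviation of each leaf, rounded to whole zł.
spread = {node: round(sqrt(variance)) for node, (_, variance) in leaves.items()}
# spread == {4: 100, 5: 98, 11: 99, 19: 43, 20: 121, 21: 143}
```

These rounded values coincide with the ± 100, 98, 99, 43, 121 and 143 zł given for the Debniki district.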

By analyzing the ranking of predictors at each stage of division, it is possible to rank the attributes of the property by their ability to create groups that are homogeneous in terms of the transaction price. The graph shown in figure 5 orders the attributes forming the transaction price of the real estate. Again, the qualitative features outweigh the measurable characteristics of the premises. The main criteria when selecting a dwelling are “Street”, “Rooms layout”, “Neighborhood” and “Standard”. Further items are “Expenditure”, “Surface”, “Floor” and “Year of Building”; “Basement” was ranked last.

Fig. 5. The validity of attributes for the district Debniki (attributes: Surface, Expenditure, Floor, Year of Building, Street, Neighborhood, Standard, Communication Access, Rooms Layout, Basement)

The analyzed district occupies an extensive area and combines areas of pre-war buildings with newly built ones. The dwellings also differ substantially. It is therefore difficult to speak of the same technologies or the same period of construction. Because the market is so diversified, it is difficult to establish clear criteria of division; at the same time such a division is extremely important for investors, because it allows them to look for valuable areas for further investments. Interestingly, the territorial division (by location on a particular street) does not create a closed cluster, but a group of selected streets, from Szwedzka Street to Obozowa Street.

Combining the predictor ranking with the tree graph allows an exploratory analysis of the processes governing the local market. In this study, features of a qualitative nature (“Street”, “Rooms layout” or “Neighborhood”) are the most important criteria shaping property prices. Apart from determining which attributes are especially valuable to buyers, sets of properties that are homogeneous in terms of unit price can be identified.

4. Conclusion

Simultaneous consideration of qualitative and quantitative features in research on the real estate market is a challenge for analysts, which is why regression trees deserve attention.

The introduction of the feature “Street” as an attribute specifying the location of the property has made it possible to deepen the analysis. This feature is at the forefront of the established predictor rankings; however, because of its nominal nature, it cannot be used in classical studies. Because such analyses are limited to the area of one district, where the address of the real estate is not distinguished, the only criterion for estimating the property is the neighborhood expressed on a numerical scale.

In addition, it is worth noting that the qualitative features take the first places in the ranking of features that affect the market price of the property. Finding a dwelling with a similar surface in the same neighborhood is not a problem; it is much more difficult to narrow the search down to the same street and a similar standard of finish. Therefore, C&RT models can be successfully used as a tool supporting the work of real estate experts, as they enable the selection of similar properties even in very complex sets.

Other sectors related to the real estate market, such as banks, developers and real estate agencies, could also assess the attractiveness of new investments on the basis of C&RT trees, or estimate how to modernize existing properties.

References

[1] Breiman L. et al.: Classification and Regression Trees. Chapman & Hall/CRC, New York 1998.

[2] Cellmer R.: Zasady i metody analizy elementów składowych rynku nieruchomości. Olsztyn 1999.

[3] Czaja J., Preweda E.: Analiza statystyczna zmiennej losowej wielowymiarowej w aspekcie korelacji i predykcji. Geodezja, T. 6, z. 2, Kraków, 2000, pp. 129–145.

[4] Fayyad U.M., Piatetsky-Shapiro G., Smyth P.: Advances in Knowledge Discovery and Data Mining. AAAI Press, 1996.

[5] Kucharska-Stasiak E. (red.): Międzynarodowe Standardy Wyceny (wyd. polskie). Polska Federacja Stowarzyszeń Rzeczoznawców Majątkowych, Warszawa 2005.

[6] Kafkowski L.: Rynek nieruchomości w Polsce. Tweeger, Warszawa 2003.

[7] Statistica (data analysis software system), version 8.0. StatSoft, 2007 [on-line:] www.statsoft.com.
